Growing Science » Tags cloud » Reinforcement Learning

1.

To reduce maximum tardiness by Seru Production: model, cooperative algorithm combining reinforcement learning and insights Pages 65-82

Authors: Guanghui Fu, Yang Yu, Wei Sun, Ikou Kaku

DOI: 10.5267/j.ijiec.2022.10.002

Keywords: Cooperative algorithm, Reinforcement learning, Maximum tardiness, Seru production

Abstract:
The maximum tardiness reflects the worst level of service associated with customer needs; thus, the principle that seru production reduces the maximum tardiness is investigated, and a model to minimize the maximum tardiness of the seru production system is established. To obtain the exact solution, the non-linear seru production model that minimizes the maximum tardiness is split into a seru formation model and a linear seru scheduling model. We propose an efficient cooperative algorithm using a genetic algorithm and an innovative reinforcement learning algorithm (CAGARL) for large-scale problems. Specifically, the GA is designed for the seru formation problem. Moreover, the QL-seru algorithm (QLSA) is designed for the seru scheduling problem by combining the features of meta-heuristics and reinforcement learning. In the QLSA, we design an innovative QL-seru table and two state trimming rules to save computational time. In extensive experiments, CAGARL improved on the previous algorithm by an average of 56.6%. Finally, several managerial insights on reducing maximum tardiness are proposed.

Journal: IJIEC | Year: 2023 | Volume: 14 | Issue: 1 | Views: 1667 | Reviews: 0
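
The objective named in this abstract, maximum tardiness, can be illustrated with a minimal sketch. It assumes a single worker processing jobs one after another in a fixed order; the function names are hypothetical and this is not the paper's seru formation or scheduling model.

```python
def completion_times_in_sequence(processing_times):
    """Completion time of each job when jobs run back to back in the given order."""
    elapsed = 0
    completions = []
    for p in processing_times:
        elapsed += p
        completions.append(elapsed)
    return completions


def max_tardiness(completion_times, due_dates):
    """Largest lateness over all jobs, where a job finished early counts as 0."""
    if len(completion_times) != len(due_dates):
        raise ValueError("completion_times and due_dates must have equal length")
    return max(
        (max(0, c - d) for c, d in zip(completion_times, due_dates)),
        default=0,
    )


# Illustrative data only (not from the paper).
completions = completion_times_in_sequence([3, 2, 4])
print(max_tardiness(completions, [4, 4, 8]))
```

A scheduler minimizing this quantity searches over job orders and assignments so that the worst-served job is as little late as possible.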

 
2.

Simulation and modeling of human decision-making process through reinforcement learning based computational model involving past experiences Pages 366-378

Authors: Nimisha Gupta, Mitul Kumar Ahirwal, Mithilesh Atulkar

DOI: 10.5267/j.dsl.2022.9.001

Keywords: Past experiences, Decision-Making, Reinforcement Learning, Learning rules, Iowa Gambling Task

Abstract:
Experience plays a vital role in the decision-making (DM) process. This paper simulates, models, and analyzes the effect of past experience on DM using the Iowa gambling task (IGT). The human DM process is complex and difficult to model computationally because it is subjective and varies from person to person. This study therefore attempts to simulate a DM model similar to the human DM process. To this end, real data were collected and provided as input to eight Reinforcement Learning (RL) models developed for the study. The results show that the model based on Prospect Utility (PU), learned with the Decay Reinforcement Rule (DRI) and Trial Dependency Choice (TDC), performs better than the other models. Analysis of the data, validated by the simulation and model output, shows that the experienced group performs better than the inexperienced group.

Journal: DSL | Year: 2022 | Volume: 11 | Issue: 4 | Views: 1370 | Reviews: 0
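
The abstract names three model components (Prospect Utility, a decay reinforcement rule, and a trial-dependent choice rule) without giving their formulas. The sketch below uses forms that are common in the Iowa gambling task modeling literature; these forms, the parameter names, and the default values are all assumptions and may differ from the paper's definitions.

```python
import math


def prospect_utility(payoff, alpha=0.5, loss_aversion=2.0):
    """Assumed form: gains are compressed by alpha, losses also scaled by loss_aversion."""
    if payoff >= 0:
        return payoff ** alpha
    return -loss_aversion * (abs(payoff) ** alpha)


def decay_reinforcement_update(expectancies, chosen_deck, utility, decay=0.8):
    """Assumed form: every deck's expectancy decays; only the chosen deck gains the utility."""
    return [
        decay * e + (utility if i == chosen_deck else 0.0)
        for i, e in enumerate(expectancies)
    ]


def trial_dependent_choice_probs(expectancies, trial, consistency=0.5):
    """Assumed form: softmax whose sensitivity (trial / 10) ** consistency changes with trial >= 1."""
    theta = (trial / 10.0) ** consistency
    scaled = [theta * e for e in expectancies]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# One simulated trial with four decks and an illustrative payoff of +100 on deck 2.
expectancies = [0.0, 0.0, 0.0, 0.0]
utility = prospect_utility(100)
expectancies = decay_reinforcement_update(expectancies, chosen_deck=2, utility=utility)
print(trial_dependent_choice_probs(expectancies, trial=1))
```

After a positive payoff, the chosen deck's expectancy rises and its choice probability on the next trial increases accordingly.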

 
3.

Dynamic flexible job shop scheduling using greedy actor–neural critic PPO reinforcement learning algorithm Pages 819-840

Authors: Selva Kumar Chandrasekar, Hariss Kumar Shanmugaprabu, Aswath Mani, Nishanth Sami Raja Murugan

DOI: 10.5267/j.jpm.2026.4.009

Keywords: Dynamic flexible job shop scheduling, Reinforcement learning, Makespan minimization, Energy consumption minimization, Machine utilization, Disruptions

Abstract:
Dynamic Flexible Job Shop Scheduling (DFJSS) is a critical challenge in smart manufacturing due to dynamic job arrivals, machine breakdowns, routing flexibility, and multi-objective performance requirements. Conventional dispatching rules and metaheuristic algorithms often fail to provide adaptive real-time decisions. This paper proposes a Greedy Actor Neural Critic PPO based Reinforcement Learning (GANC-PPO-RL) framework to minimize makespan and total energy consumption. The problem is formulated as a Markov Decision Process, where machine availability, job progress, queue status, processing times, and energy levels define states, and job–machine assignments define actions. A multi-objective reward penalizes higher makespan and energy usage. The PPO algorithm ensures stable policy updates using a clipped objective. Results show makespan within ±2% of benchmarks for small instances and about 23% reduction for larger cases, with 16–20% lower makespan and 15–25% lower energy consumption compared to RL-QL.

Journal: JPM | Year: 2026 | Volume: 11 | Issue: 3 | Views: 8 | Reviews: 0
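
The clipped objective mentioned in this abstract can be sketched in its standard PPO form: the probability ratio between the new and old policy is clipped to [1 - eps, 1 + eps], and the smaller of the clipped and unclipped terms is kept. The function name and the default `clip_eps=0.2` are assumptions; the paper's GANC-PPO-RL framework may configure this differently.

```python
def clipped_surrogate(ratios, advantages, clip_eps=0.2):
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over sampled actions."""
    if not ratios or len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must be non-empty and equal in length")
    terms = []
    for r, a in zip(ratios, advantages):
        clipped_r = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        # Taking the minimum removes any incentive to move the ratio outside the clip range.
        terms.append(min(r * a, clipped_r * a))
    return sum(terms) / len(terms)


# Illustrative values: one action with positive advantage, one with negative advantage.
print(clipped_surrogate([1.5, 0.5], [1.0, -1.0]))
```

In a scheduling setting each ratio would correspond to one job-machine assignment sampled under the old policy, and training would maximize this quantity.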

 
4.

Project portfolio management in the age of artificial intelligence: A review of challenges, key features, and future research directions Pages 247-272

Authors: Esmaeil Taheripour, Seyed Jafar Sadjadi

DOI: 10.5267/j.jpm.2025.9.006

Keywords: Project portfolio management, Artificial intelligence, Machine learning, Deep learning, Neural network, Reinforcement learning

Abstract:
The rapid advancement of artificial intelligence (AI) has revolutionized project portfolio management (PPM), as it has in many other areas, by introducing data-driven methods that improve decision-making, risk assessment, and strategic alignment. Unlike traditional project management, which emphasizes individual project execution, PPM requires balancing multiple initiatives to optimize value creation and resource allocation. This paper presents a systematic review of scientific research on the integration of AI techniques into PPM, focusing on their applications, benefits, and challenges. The review synthesizes findings from 73 peer-reviewed studies covering a wide range of AI methodologies, such as machine learning, deep learning, neural networks, reinforcement learning, natural language processing, and hybrid optimization models. These approaches have been applied in diverse fields, including information technology, construction, healthcare, defense, energy, and telecommunications. Analysis shows that AI significantly improves project portfolio performance by predicting project outcomes, identifying interdependencies, optimizing resource allocation, and supporting adaptive strategies in dynamic environments. In addition, advanced AI tools provide project portfolio managers with predictive and prescriptive analytics, transforming PPM from reactive monitoring to proactive governance. Despite these advances, challenges remain regarding data quality, organizational readiness, and interpretability of AI-based models. Concerns about transparency, ethical implications, and integration with existing management frameworks also hinder wider adoption. However, recent developments indicate a growing trend toward hybrid systems that combine AI with traditional decision-making models, increasing both accuracy and practical applicability. 
This review contributes to theory and practice by synthesizing current knowledge, highlighting research gaps, and identifying emerging directions such as the use of large language models, ensemble methods, and sustainability-focused project portfolio optimization. The findings highlight the transformative potential of AI in advancing PPM and provide valuable insights for researchers and practitioners seeking to design smarter, more adaptive, and more sustainable project portfolio management strategies.

Journal: JPM | Year: 2026 | Volume: 11 | Issue: 1 | Views: 1124 | Reviews: 0

 
5.

An IoT-enabled reinforcement learning-driven ground robot for precision navigation and smart interaction in dynamic environments Pages 773-782

Authors: Indra Kishor, Udit Mamodiya, Mohammed Almaayah, Mansour Obeidat, Rami Shehab, Theyazn H. H. Aldhyani

DOI: 10.5267/j.ijdns.2025.8.007

Keywords: Reinforcement Learning, Edge Computing, Ground Robotics, IoT Communication, Semantic Mapping, Human–Robot Interaction, Raspberry Pi

Abstract:
Autonomous ground robots are increasingly relied upon in dynamic environments where reliable navigation and context-aware interaction are essential. However, existing robotic control systems often rely on cloud-based reinforcement learning (RL) frameworks or static algorithms that fail to adapt in real-time to noisy, unpredictable scenarios. These models typically overlook the constraints of edge deployment and lack robust integration with human interaction modalities such as voice and semantic object awareness. To address these limitations, this work proposes a fully embedded, IoT-enabled ground robot powered by a reinforcement learning-based adaptive control framework. The system leverages Raspberry Pi 4B+ as its core computational unit, integrating MQTT-driven communication, multimodal interaction through speech and vision, and lightweight policy convergence for obstacle-aware navigation. A novel RL-based state-action pipeline is trained and deployed entirely on-device, ensuring real-time responsiveness without external computation. Experimental evaluations show that the proposed framework reduces navigation errors by 22% and improves interaction latency by 37% over traditional PID and A*-based systems. The RL model converges in under 2200 episodes, with stable reward curves and high reliability across variable acoustic and physical terrains. This study showcases how low-cost, edge-based robots can achieve high autonomy and situational awareness contributing to future advancements in resilient, self-adaptive robotic systems within smart and resource-constrained environments.

Journal: IJDS | Year: 2025 | Volume: 9 | Issue: 4 | Views: 980 | Reviews: 0

 
6.

Reinforcement learning-driven feature selection for enhanced classification in cybersecurity: Applications in IoT security and malware detection Pages 813-822

Authors: Hanaa Fathi, Ola Malkawi, Arar Al Tawil, Amneh Shaban, Dyala Ibrahim, Mohammad Adnan Aladaileh

DOI: 10.5267/j.ijdns.2025.8.003

Keywords: Feature Selection, Reinforcement Learning, Machine Learning, XGBoost, Random Forest, Multi-Layer Perceptron, IoT Security, Malware Detection

Abstract:
The effectiveness and efficiency of a machine learning model can be improved by feature selection, especially for high-dimensional datasets such as those in cybersecurity. The proposed approach uses an enhanced version of the Rainbow agent with a memory storage structure. It is assessed on two benchmark datasets: RT-IoT2022, which targets IoT network security, and the Android Malware Detection dataset, which targets mobile security. The reinforcement learning model is trained for 20 epochs and progressively refines the feature subsets to improve classification accuracy. The results show that the AUC scores increase steadily, reaching 0.91 for RT-IoT2022 and 0.93 for Android. Three well-known classifiers, XGBoost, Random Forest, and multi-layer perceptron (MLP), are used to test the power of the selected features. On the RT-IoT2022 dataset, Random Forest achieved the highest accuracy (99.48%), followed by XGBoost (99.16%), while MLP reached 94.04%. On the Android malware dataset, XGBoost gave the best accuracy (89.50%), followed closely by Random Forest (87.00%) and MLP (86.50%). These results indicate that reinforcement learning-based feature selection enhances accuracy and reduces computation. The research emphasizes the use of dynamic feature selection in cybersecurity applications. Future work will experiment with incorporating deep reinforcement learning as well as hybrid selection.

Journal: IJDS | Year: 2025 | Volume: 9 | Issue: 4 | Views: 828 | Reviews: 0

 
7.

Solving blocking flowshop scheduling problem with makespan criterion using q-learning-based iterated greedy algorithms Pages 85-100

Authors: M. Fatih Tasgetiren, Damla Kizilay, Levent Kandiller

DOI: 10.5267/j.jpm.2024.2.002

Keywords: Q-learning-based iterated greedy algorithms, Reinforcement learning, Blocking flowshop scheduling problem

Abstract:
This study proposes Q-learning-based iterated greedy (IGQ) algorithms to solve the blocking flowshop scheduling problem with the makespan criterion. Q-learning is a model-free machine intelligence technique, which is adapted into the traditional iterated greedy (IG) algorithm to determine its parameters, mainly the destruction size and temperature scale factor, adaptively during the search process. Besides IGQ algorithms, two different mathematical modeling techniques are introduced. One of these techniques is the constraint programming (CP) model, which is known to work well with scheduling problems. The other technique is the mixed integer linear programming (MILP) model, which provides the mathematical definition of the problem. The introduction of these mathematical models supports the validation of IGQ algorithms and provides a comparison between different exact solution methodologies. To measure and compare the performance of IGQ algorithms and mathematical models, extensive computational experiments have been performed on both small and large VRF benchmarks available in the literature. Computational results and statistical analyses indicate that IGQ algorithms generate substantially better results when compared to non-learning IG algorithms.

Journal: JPM | Year: 2024 | Volume: 9 | Issue: 2 | Views: 2007 | Reviews: 0
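
The idea of letting Q-learning pick an algorithm parameter during the search can be sketched with a tabular update and an epsilon-greedy choice. The abstract does not specify states, actions, or rewards, so the single state `"search"`, the candidate destruction sizes, and the reward value below are hypothetical.

```python
import random


def q_update(q_table, state, action, reward, next_state, lr=0.1, discount=0.9):
    """Standard tabular Q-learning update toward reward + discount * best next value."""
    best_next = max(q_table[next_state].values())
    td_target = reward + discount * best_next
    q_table[state][action] += lr * (td_target - q_table[state][action])


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise take the best-valued action."""
    actions = list(q_table[state])
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q_table[state][a])


# Hypothetical setup: actions are candidate destruction sizes for an iterated greedy step.
q_table = {"search": {2: 0.0, 4: 0.0, 6: 0.0}}
rng = random.Random(0)

size = choose_action(q_table, "search", epsilon=0.1, rng=rng)
reward = 1.0  # assumed reward, e.g. granted when the step improved the makespan
q_update(q_table, "search", size, reward, "search")
print(size, q_table["search"][size])
```

Over many iterations, parameter values that tend to produce improvements accumulate higher Q-values and are selected more often.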

 

® 2010-2026 GrowingScience.Com