Motivated by production processes such as chemical and steel manufacturing, this paper studies the distributed blocking flowshop scheduling problem with preventive maintenance and sequence-dependent setup times (DBFSP/PM/SDST). First, the problem is formulated as a mixed-integer linear programming model with the objective of minimizing the total flowtime. Second, a discrete grey wolf optimization algorithm co-driven by Q-learning and a learning mechanism (DGWO_Q) is proposed. In the algorithm, Q-learning adjusts the neighborhood search structure based on dynamic feedback from the environment. Learning mechanisms introduced in the search phase guide the grey wolves as they approach the prey, which improves the balance between exploration and exploitation. Furthermore, a differential hunting strategy is designed to prevent the algorithm from becoming trapped in local optima. Third, a heuristic tailored to the problem characteristics is proposed to improve the quality of the initial solution. Finally, DGWO_Q is compared with four efficient existing algorithms in numerical experiments on 225 instances of different sizes. The experimental results show that DGWO_Q performs well across test cases of all scales, effectively reducing the production cycle time, the setup times, and the impact of maintenance downtime on production efficiency. It thus provides an efficient intelligent optimization approach for this complex scheduling problem.
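
To make the idea of Q-learning-driven neighborhood selection concrete, the following is a minimal sketch under simplifying assumptions, not the DGWO_Q algorithm itself. It considers a single blocking flowshop without setup times, maintenance, or multiple factories, and omits the grey wolf search entirely. The state definition (the operator applied last), the 0/1 reward, the two operators, and all names (`total_flowtime`, `q_guided_search`, `swap_move`, `insert_move`) are hypothetical choices made for illustration only.

```python
import random


def total_flowtime(seq, p):
    """Total flowtime of a job permutation on one blocking flowshop.

    p[j][i] is the processing time of job j on machine i.
    Assumption: no setup times and no maintenance (simplified model).
    """
    m = len(p[0])
    # prev[i] = time the previous job left machine i (1-based); prev[0] unused.
    prev = [0.0] * (m + 1)
    total = 0.0
    for j in seq:
        cur = [0.0] * (m + 1)
        cur[0] = prev[1]  # job enters machine 1 once the previous job has left it
        for i in range(1, m + 1):
            finish = cur[i - 1] + p[j][i - 1]
            # No buffers: the job stays on machine i until machine i+1 is free.
            cur[i] = max(finish, prev[i + 1]) if i < m else finish
        total += cur[m]
        prev = cur
    return total


def swap_move(seq, rng):
    """Exchange the jobs at two random positions."""
    s = seq[:]
    a, b = rng.sample(range(len(s)), 2)
    s[a], s[b] = s[b], s[a]
    return s


def insert_move(seq, rng):
    """Remove the job at one random position and reinsert it at another."""
    s = seq[:]
    a, b = rng.sample(range(len(s)), 2)
    s.insert(b, s.pop(a))
    return s


OPERATORS = [swap_move, insert_move]


def q_guided_search(seq, p, iters=200, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    """Local search in which Q-learning picks the neighborhood operator."""
    rng = random.Random(seed)
    n_ops = len(OPERATORS)
    # Assumed state: index of the operator applied last; action: next operator.
    q = [[0.0] * n_ops for _ in range(n_ops)]
    state = 0
    best, best_cost = seq[:], total_flowtime(seq, p)
    for _ in range(iters):
        if rng.random() < eps:  # explore
            action = rng.randrange(n_ops)
        else:  # exploit the current Q-table
            action = max(range(n_ops), key=lambda a: q[state][a])
        cand = OPERATORS[action](best, rng)
        cost = total_flowtime(cand, p)
        reward = 1.0 if cost < best_cost else 0.0  # feedback from the search
        if cost < best_cost:
            best, best_cost = cand, cost
        next_state = action
        # Standard Q-learning update.
        q[state][action] += alpha * (
            reward + gamma * max(q[next_state]) - q[state][action]
        )
        state = next_state
    return best, best_cost
```

For example, `q_guided_search([1, 0], [[2, 3], [4, 1]])` starts from a two-job sequence and returns the best permutation found together with its total flowtime. In a fuller implementation, the reward and state would presumably reflect the richer feedback described above, which this sketch does not attempt to reproduce.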
