La Recherche en France - CF202545582 Learning Algorithms for Designing Undeceivable Policies to Foster Sustainable Behavior

Description

Personalized Demand-Side Mitigation Strategies (PDSMS) encourage agents to make sustainable choices.[IE21] A regulator learns agents' preferences by observing their choices and adapts signals, e.g., incentives or prices, accordingly.[Ar18] State-of-the-art PDSMS[Ar23,As21] are based on Random Utility Theory (RUT) and assume that agents are honest, i.e., that they make choices to maximize their utility.[Be19,§3.1] We study instead the case where agents may be deceptive, making choices to manipulate the regulator and obtain favorable signals. Our objective is to answer the following research questions: Under which conditions do deceptive agents cancel out the benefits of PDSMS? How can PDSMS be made robust to them? These remain open questions, which highlights the novelty of our project. We build our approach on recent advances in AI and Game Theory,[Gan20,Xu21] but our originality is that we will explicitly model the regulator's learning process (missing so far) and show that it can deter agents from deceiving. To this aim, we will devise a novel reinforcement learning formulation rooted in RUT and resort to Mechanism Design to make PDSMS robust to deception.[Bo15] Lab experiments will validate our findings. PDSMS can contribute to sustainability (e.g., reducing pollution by up to 40% [IP23,IP22]). The theoretical framework resulting from our project can unlock their full potential.

Required skills

When: the starting date is flexible.
Duration: 3 years.
Requirements: excellent mathematical modeling and analytical skills; good programming skills.
To apply: please send your CV, a 5-line explanation of why you are the best fit for this position (with factual elements, not vague or generic ones), and all the marks of your BSc- and MSc-level courses. Sending your ranking is not mandatory, but it is a big plus. Send all this material to andrea.araldo@telecom-sudparis.eu

Bibliography

[Ar18] Araldo, Seshadri, Ben-Akiva et al. System-level optimization of multi-modal transportation networks TRR (2019)
[Ar23] Araldo et al., Personalized Incentives with Constrained Regulator's Budget Transportmetrica A (2023)
[As21] Ascarza et al. Eliminating unintended bias in personalized policies using bias-eliminating adapted trees Proceedings National Academy of Sciences 2022
[Ba00] Orme, Comparing hierarchical Bayes draws and randomized first choice for conjoint simulations SawtoothSoftware (2000)
[Ba24] Tasnim, Mayesha, et al. 'Strategic manipulation of preferences in the rank minimization mechanism.' Autonomous Agents and Multi-Agent Systems 38.2 (2024): 44.
[Be85] Ben-Akiva, Moshe E., and Steven R. Lerman. Discrete choice analysis: theory and application to travel demand. Vol. 9. MIT press, 1985.
[Be91] Ben-Akiva et al. Analysis of the reliability of preference ranking data Journal of Business Research (1991)
[Be19] Ben-Akiva, McFadden, Train. Foundations of stated preference elicitation: Consumer behavior and choice-based conjoint analysis Foundations and Trends in Econometrics (2019)
[Bo15] Börgers, Tilman. An introduction to the theory of mechanism design. Oxford University Press, USA, 2015.
[Bo20] Botta, Marco, and Klaus Wiedemann. 'To discriminate or not to discriminate? Personalised pricing in online markets as exploitative abuse of dominance.' European Journal of Law and Economics 50 (2020): 381-404.
[Br88] Breton, Michele, Abderrahmane Alj, and Alain Haurie. 'Sequential Stackelberg equilibria in two-person games.' Journal of Optimization Theory and Applications 59 (1988): 71-97.
[Br18] Aziz, H., Brandl, F., Brandt, F., & Brill, M. (2018). On the tradeoff between efficiency and strategyproofness. Games and Economic Behavior, 110, 1-18.
[Ch15] Hu, Xianbiao, Yi Chang Chiu, and Lei Zhu. 2015. Behavior Insights for an Incentive-Based Active Demand Management Platform. International Journal of Transportation Science and Technology 4 (2): 119-133. http://dx.doi.org/10.1260/2046-0430.4.2.119.
[Ch21] Li, Tianhao, Peng Chen, and Ye Tian. 'Personalized incentive-based peak avoidance and drivers' travel time-savings.' Transport Policy 100 (2021): 68-80.
[Ch22] Sen, Suman, Michael B. Charles, and Jennifer L. Harrison. 'Usage-based road pricing and potential equity issues: A study of commuters in South East Queensland, Australia.' Transport policy 118 (2022): 33-43.
[Cl71] E. H. Clarke, Multipart pricing of public goods, Public Choice, vol. 11, pp. 17-33, 1971.
[Cl22] Clempner, J. B. (2022). Learning machiavellian strategies for manipulation in Stackelberg security games. Annals of Mathematics and Artificial Intelligence, 90(4), 373-395.
[Da19] Ben-Akiva, Danaf, et al. Online discrete choice models: Applications in personalized recommendations Sec. 4.2, Decision Support Systems (2019)
[Da13] Anshelevich, E., Das, S., & Naamad, Y. (2013). Anarchy, stability, and utopia: Creating better matchings. Autonomous Agents and Multi-Agent Systems, 26(1), 120-140.
[Da25] Mark Daychman, Andrea Araldo, Amir Brudner, Ravi Seshadri, 2025 Towards Undeceivable Personalized Policies to Promote Sustainable Behavior, International Conference on Agents and Artificial Intelligence (ICAART) - Abstract Track
[De01] De Borger, Bruno. 2001. Discrete choice models and optimal two-part tariffs in the presence of externalities: optimal taxation of cars. Regional Science and Urban Economics 31 (4): 471-504
[Dr19] Drutsa et al., Optimal Pricing in Repeated Posted-Price Auctions, NeurIPS (2019)
[Gan20] Birmpas, G., Gan, J., Hollender, A., Marmolejo, F., Rajgopal, N. and Voudouris, A., 2020. Optimally deceiving a learning leader in stackelberg games. Advances in Neural Information Processing Systems, 33, pp.20624-20635.
[Go20] Paul W. Goldberg, Edwin Lock, and Francisco Marmolejo-Cossío. Learning strong substitutes demand via queries. In Proceedings of The 16th Conference on Web and Internet Economics (WINE 20), page to appear, 2020.
[Gn00] Gneezy et al. Pay enough or don't pay at all. The quarterly journal of economics (2000)
[Gr73] T. Groves, Incentives in teams, Econometrica, vol. 41, pp. 617-631, 1973.
[Ha20] Rosenfeld, Ariel, and Avinatan Hassidim. 'Too smart for their own good: Trading truthfulness for efficiency in the Israeli medical internship market.' Judgment and Decision Making 15.5 (2020): 727-740.
[IE21] Do we need to change our behaviour to reach net zero by 2050?, International Energy Agency (2021)
[IM23] Public Perceptions of Climate Mitigation Policies, International Monetary Fund (2023)
[IP22] Demand, services and social aspects of mitigation, Supplement, IPCC, Sec.5.SM.2 (2022)
[IP23] Climate Change 2023 Synthesis, IPCC, Fig.4.4 (2023)
[Ka21] Kallus, Nathan, and Angela Zhou. 'Fairness, welfare, and equity in personalized pricing.' Proceedings of the 2021 ACM conference on fairness, accountability, and transparency. 2021.
[Ke23] Collina, Natalie, Eshwar Ram Arunachaleswaran, and Michael Kearns. 'Efficient stackelberg strategies for finitely repeated games.' Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems. 2023.
[Kr20] Sessa, P. G., Bogunovic, I., Kamgarpour, M., & Krause, A. (2020). Learning to play sequential games versus unknown opponents. Advances in neural information processing systems, 33, 8971-8981.
[IA06] https://iatbr.weebly.com/award-winners.html
[IN17] https://www.informs.org/Recognizing-Excellence/Award-Recipients/Moshe-Ben-Akiva
[La04] Sebastien M. Lahaie and David C. Parkes. Applying learning algorithms to preference elicitation. In Proceedings of the 5th ACM conference on Electronic commerce (EC04), pages 180-188, 2004.
[Le19] Le Bras, H. (2019). La voiture, les « gilets jaunes » et le Rassemblement national. Études, Avril(4), 31-44. https://doi.org/10.3917/etu.4259.0031.
[Ma21] Ferguson, Bryce L., Philip N. Brown, and Jason R. Marden. 'The effectiveness of subsidies and taxes in atomic congestion games.' IEEE Control Systems Letters 6 (2021): 614-619.
[McF00] McFadden, Nobel Prize lecture (2000)
[Mo15] Mohri et al., Revenue optimization against strategic buyers NeurIPS (2015)
[Ni20] Tang, Yili, Yu Jiang, Hai Yang, and Otto Anker Nielsen. 2020. Modeling and optimizing a fare incentive strategy to manage queuing and crowding in mass transit systems. Transportation Research Part B: Methodological 138: 247-267.
[OE18] OECD Secretariat. (2018). Personalised pricing in the digital era. Report published on 28.11.2018. Retrieved Feb, 2025 from http://www.oecd.org/daf/competition/personalised-pricing-in-the-digital-era.htm.
[Po14] Firpo, Sergio, et al. 'Evidence of eligibility manipulation for conditional cash transfer programs.' EconomiA 15.3 (2014): 243-260.
[Pr09] Merugu, Deepak, Balaji S Prabhakar, and N Rama. 2009. An incentive mechanism for decongesting the roads: A pilot program in bangalore. In Proc. of ACM NetEcon Workshop
[Pr15] Yue, Jia Shuo, Chinmoy V Mandayam, Deepak Merugu, Hossein Karkeh Abadi, and Balaji Prabhakar. 2015. Reducing road congestion through incentives: a case study. In Transportation Research Board 94th Annual Meeting, Washington, DC,
[Pr16] Hayes, Barry, et al. 'Residential demand management using individualized demand aware price policies.' IEEE Transactions on Smart Grid 8.3 (2016): 1284-1294.
[Ro81] Small, Kenneth A., and Harvey S. Rosen. 1981. Applied Welfare Economics with Discrete Choice Models. Econometrica 49 (1): 105-130.
[Ro13] Rostamizadeh et al., Learning prices for repeated auctions with strategic buyers NeurIPS (2013)
[Ro19] Rotemberg. Equilibrium effects of firm subsidies American Economic Review (2019)
[Se24] Xie, Y., Seshadri, R. & Ben-Akiva, M. E. (2024). Real-time personalized tolling for managed lanes. Transportation Research Part C: Emerging Technologies, 163, 104629.
[Sm76] Smith, V.L., 1976. Experimental economics: induced value theory. Am. Econ. Rev. 66 (2), 274-279.
[Su20] Sun, Jian, Jiyan Wu, Feng Xiao, Ye Tian, and Xiangdong Xu. 2020. Managing bottleneck congestion with incentives. Transportation Research Part B: Methodological 134: 143-166. https://doi.org/10.1016/j.trb.2020.01.010
[Sv99] Svensson, L.-G. (1999). Strategy-proof allocation of indivisible goods. Social Choice and Welfare, 16, 557-567.
[Ta22] Wang et al. Coordinating followers to reach better equilibria AAAI Conference (2022)
[Uk17] Dixit, Vinayak V., Andreas Ortmann, E. Elisabet Rutström, and Satish V. Ukkusuri. 'Experimental Economics and choice in transportation: Incentives and context.' Transportation Research Part C: Emerging Technologies 77 (2017): 161-184.
[Va21] Vadiveloo M, Guan X, Parker HW, et al. Effect of Personalized Incentives on Dietary Quality of Groceries Purchased: A Randomized Crossover Trial. JAMA Netw Open. 2021;4(2):e2030921. doi:10.1001/jamanetworkopen.2020.30921
[Va19] Hu, Lily, Nicole Immorlica, and Jennifer Wortman Vaughan. 'The disparate effects of strategic manipulation.' Proceedings of the Conference on Fairness, Accountability, and Transparency. 2019.
[Ve10] Ettema, Dick, Jasper Knockaert, and Erik Verhoef. 2010. Using incentives as traffic management tool: empirical results of the peak avoidance experiment. Transportation Letters 2 (1): 39-51.
[Vi61] W. Vickrey, Counterspeculation, auctions, and competitive sealed tenders, J. Finance, vol. 16, no. 1, pp. 8-37, 1961.
[WC07] https://cee.mit.edu/ben-akiva-honored-for-lifetime-achievement-in-transportation-research/
[Xu21] Dawkins, Quinlan, Minbiao Han, and Haifeng Xu. 'The limits of optimal pricing in the dark.' Advances in Neural Information Processing Systems 34 (2021): 26649-26660.
[Zi03] Martin A Zinkevich, Avrim Blum, and Tuomas Sandholm. On polynomial-time preference elicitation with value queries. In Proceedings of the 4th ACM Conference on Electronic Commerce, pages 176-185, 2003.
[Zi04] Avrim Blum, Jeffrey C. Jackson, Tuomas Sandholm, and Martin Zinkevich. Preference elicitation and query learning. Journal of Machine Learning Research, 5:649-667, 2004

Keywords

Management Science, Game Theory, Applied Mathematics
