What is the impact of the Data Protection Bill 2023 of the Government of India on machine learning application development?
This article delves into the challenges that the Data Protection Bill 2023 poses for the development of customer-centric machine learning applications. The passage of this bill by the Government of India (GOI) is a significant step towards safeguarding clients’ personal data, which was much needed given the rampant mishandling and misuse of such data by companies and government bodies. However, its unintended impact on the progress of machine learning applications cannot be ignored.
Understanding this impact requires delving into how machine learning applications work. At the core of a machine learning application is the model, which is built from historical data. The effectiveness of the model depends heavily on both the quantity and quality of the data used to create it. While it is possible to build a model from synthetic data, a model trained on actual data generally performs better. The model must also be continually retrained with additional data to keep it relevant.
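To make this concrete, here is a minimal sketch of the workflow just described: a model is fitted to historical customer data, and the same pipeline is re-run as new data arrives. The file name and column names (quote_history.csv, age, sum_insured, claims_count, purchased) are hypothetical illustrations, not a real intermediary’s schema.

```python
# A minimal sketch of building a model from historical customer data.
# All file and column names below are hypothetical illustrations.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Historical data collected while serving insurance quotes
historical = pd.read_csv("quote_history.csv")

features = ["age", "sum_insured", "claims_count"]
X = historical[features]
y = historical["purchased"]  # did the customer buy the quoted policy?

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print(f"Hold-out accuracy: {model.score(X_test, y_test):.2f}")

# The model's quality depends directly on the volume and quality of
# `historical`; as new quotes arrive, this pipeline is re-run so the
# model stays relevant.
```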
Although a trained model generally cannot be reverse-engineered to recover the original records, it cannot be disregarded that the model’s foundation lies in consumer data. The Data Protection Bill introduces two key rules that impact the process of model creation for machine learning applications. For simplicity, let’s discuss this using the example of an insurance intermediary.
1. Limit the use of personal data to its intended purpose:
Under this rule, insurance intermediaries may use collected personal data only for the purpose for which it was initially gathered. For instance, data collected to provide an insurance quote cannot be repurposed for any other objective, including the development of machine learning models. The complication is that an intermediary’s direct clients are agents, while the personal data in question belongs to the agents’ own clients; the intermediary would therefore need consent from its customers’ customers before using their data to build models. Navigating this for both existing and future models is a substantial hurdle that will require careful scrutiny of the bill’s directives by legal teams. Obtaining consent across these two nested levels, while ensuring the transparency the bill mandates, presents a considerable challenge, as the sketch below illustrates.
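One plausible engineering response, sketched here under stated assumptions, is to gate the training set on an explicit, separately recorded consent flag for model building. The column names (consent_quote, consent_ml_training) are hypothetical; the bill does not prescribe any particular schema.

```python
# A hedged sketch of purpose limitation in code: only records whose end
# customers separately consented to ML use may enter the training set.
# The file name and column names are hypothetical illustrations.
import pandas as pd

records = pd.read_csv("quote_history.csv")

# Consent to receive a quote (the original purpose) is NOT consent to
# have one's data reused for model building; both flags must hold.
trainable = records[records["consent_quote"] & records["consent_ml_training"]]

print(f"{len(trainable)} of {len(records)} records are usable for training")
```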
2. Delete personal data when no longer needed:
Insurance intermediaries bear the responsibility of deleting personal data that has fulfilled its designated purpose or has become obsolete because it is irrelevant, inaccurate, or no longer necessary. Furthermore, intermediaries are obligated to comply with a client’s request to delete their data. This raises the question of whether intermediaries must also revise their machine learning models to exclude data that was used to build them but is no longer consented to.
If this is indeed the case, it introduces a significant shift in how machine learning models are constructed and maintained. For instance, if an intermediary used my data, obtained with my consent, to develop a machine learning model, it must erase my data when my policy expires, when I cease to be a client, or when I request deletion. If that in turn requires recreating the model without my data, the task would demand substantial computational resources and a revised approach to data processing, model development, testing, and deployment, potentially at frequent intervals.
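Under the assumption that the only reliable way to remove a customer’s influence on a model is to retrain it from scratch, the workflow might look like the following sketch. The helper names (erase_customer, retrain), the customer_id value, and the schema are all hypothetical.

```python
# A minimal sketch of honouring a deletion request by dropping the
# customer's records and rebuilding the model from scratch.
# erase_customer(), retrain(), and all column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression

def erase_customer(df: pd.DataFrame, customer_id: str) -> pd.DataFrame:
    """Remove every record belonging to the customer who withdrew consent."""
    return df[df["customer_id"] != customer_id]

def retrain(df: pd.DataFrame) -> LogisticRegression:
    """Rebuild the model from the remaining consented data only."""
    X = df[["age", "sum_insured", "claims_count"]]
    y = df["purchased"]
    return LogisticRegression(max_iter=1000).fit(X, y)

data = pd.read_csv("quote_history.csv")

# Triggered by policy expiry, client departure, or a deletion request:
data = erase_customer(data, customer_id="C-10482")
model = retrain(data)  # the computationally expensive step described above
```

A full retrain is the straightforward but costly approach; the frequency of deletion requests would directly drive the compute and release-management burden described above.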
Ultimately, the delicate balance between data protection and the advancement of machine learning applications requires careful consideration. The path forward involves proactive adaptation to comply with the bill’s regulations while sustaining innovation in the ever-evolving landscape of machine learning.