Can today's AI override its own rules?

What about the field of labor and social affairs specifically? Is there interest there in the changes that AI will bring?

ZWEIG: Works councils have been dealing with AI systems for a long time, certainly for over five years. We work a great deal with them, and we are currently setting up a workshop program through which we train people on a regular basis. I have also given lectures at employment agencies, and we have already done a lot in consumer protection. I think there are people everywhere who are realizing: we should look into this.

In your opinion, are there positive examples of how administrations already use AI that deserve a closer look?

ZWEIG: Overall, this is a difficult area, but it cannot be ruled out that such systems could support public offices. For that, the correctness and effectiveness of these systems must be scientifically observed and evaluated in application, and the process in which a system is to be used must also be carefully prepared. In this sense, there is a positive example of how to set up a pilot project when a state uses AI, and that is the Public Employment Service (AMS) algorithm in Austria. It is actually just a heuristic, but let's stick with the more common term “algorithm”. The system classifies the unemployed into three groups: a first group who will get back into work anyway; a third group who have not been placed for a long time and may never be placed again; and a second group in between, for everyone who falls into neither of the other two. This middle group is to receive increased support through further training measures. Obviously, this is a very sensitive task, and it therefore needs to be evaluated well.

How exactly does that work technically? How does the AI sort applicants?

ZWEIG: The heuristic used here is a so-called logistic regression, a very, very simple form of machine learning. In contrast to other machine learning methods, the result remains transparent: a human can understand how each factor affects the outcome.
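
To make this concrete: the following is a minimal sketch of such an interpretable model. The features, data, and weights are invented for illustration and do not reproduce the actual AMS model; the point is only that the learned coefficients of a logistic regression can be read off directly.

```python
# Sketch: an interpretable logistic regression in the spirit of the AMS
# heuristic. Features, data, and weights are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Hypothetical binary applicant features.
feature_names = ["is_female", "over_50", "has_care_duties"]
X = rng.integers(0, 2, size=(n, len(feature_names)))

# Synthetic labels (1 = re-employed): we build a "penalty" for each feature
# into the data to mimic a labor market that discriminates.
logits = 0.8 - 1.0 * X[:, 0] - 0.7 * X[:, 1] - 0.5 * X[:, 2]
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

model = LogisticRegression().fit(X, y)

# The learned weights are directly readable: a negative coefficient means
# the feature lowers the predicted chance of re-employment.
for name, coef in zip(feature_names, model.coef_[0]):
    print(f"{name}: {coef:+.2f}")
```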

What insights could you gain there?

ZWEIG: It turned out that there is a kind of penalty if you are a woman, if you are over 50, or if you have care responsibilities. This triggered a great outcry in the media that the software was discriminatory. But that is not entirely true, because the heuristic learned this from the labor market.

The software basically reveals discrimination, but it doesn't discriminate itself?

ZWEIG: Yes, it reveals the discrimination. Does it discriminate itself? Well, the software itself has no agency at all; it “does” nothing in the sense of acting independently. But depending on how the software is used, the detected discrimination may be passed on. According to the head of the AMS, the expected effect is that people whom the labor market has discriminated against are now increasingly placed in the middle category by the algorithm, and are thus given particularly strong support. That would then be downright anti-discrimination, a kind of compensating movement.
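
As a purely illustrative sketch, such a three-way classification and a compensating use of it could look as follows. The thresholds and the compensation rule are invented; the actual AMS cut-offs and practice are not reproduced here.

```python
# Sketch: mapping a predicted re-employment probability to the three groups
# described above. Thresholds and the compensation rule are invented.
def assign_group(p_reemployment: float, compensate: bool = False) -> str:
    if p_reemployment >= 0.66:
        group = "high"    # will likely get back into work anyway
    elif p_reemployment <= 0.25:
        group = "low"     # has not been placed for a long time
    else:
        group = "middle"  # receives increased training support

    # Hypothetical compensating policy: shift people with poor predicted
    # chances into the strongly supported middle group.
    if compensate and group == "low":
        group = "middle"
    return group

print(assign_group(0.40))        # middle
print(assign_group(0.20))        # low
print(assign_group(0.20, True))  # middle
```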

What can we learn from this Austrian example?

ZWEIG: In my opinion, there is only one way to find out whether such systems are helpful or not: we have to accompany them scientifically from the start and assess, in the overall system, whether performance is really getting better or not. But that often fails because we do not even know how good people's decisions were beforehand. Often, for example in HR, the feeling prevails that people are not making good enough decisions, and then something is supposed to be done on that basis.

What's the problem with that?

ZWEIG: That results in actionism: a system is bought that is supposedly good and that has often already been trained on external data. I have a good example from the field of medicine: an AI system that was supposed to support cancer diagnoses was tried out in pilot projects in many German clinics, and it actually did quite well. But the system also made strange suggestions. One reason for this could be that the system was trained in the United States, where doctors have a financial stake in the drugs they prescribe. The computer learned this, of course, and in some cases recommended particular medications. What I mean by that: you cannot train these systems just anywhere and simply go shopping.

Is it good that you can't just buy some pre-trained AI system?

ZWEIG: Yes, I find it comforting too. I am often asked whether Europe has already been left completely behind because the USA and China are so much further ahead. But all these examples show again and again: if you want an AI system for Europe, you have to train it on European data. Europe is an important market, but only together! And that means we have a position of power here: if we decide that our data will be handled differently, there are few alternatives for those who want to develop systems for Europe that affect the behavior of Europeans. My most important demand: our digital behavioral data should no longer be allowed to be collected centrally in order to learn from it. Decentralized machine learning processes would then have to be used. So far, the infrastructure and further research are lacking.
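
A minimal sketch of what decentralized learning in this sense could look like, under the assumption of synthetic data and a hand-rolled logistic regression: each site trains only on its own data, and only the model weights, never the raw behavioral data, are exchanged and averaged (a federated-averaging scheme).

```python
# Sketch: federated averaging over three sites with synthetic data. Each
# site runs local gradient steps; only the weights are pooled centrally.
import numpy as np

rng = np.random.default_rng(1)

def local_steps(w, X, y, lr=0.1, steps=50):
    """Logistic-regression gradient descent on one site's local data."""
    for _ in range(steps):
        p = 1 / (1 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

# Three sites, each holding its own private dataset.
w_true = np.array([1.5, -2.0])
sites = []
for _ in range(3):
    X = rng.normal(size=(200, 2))
    y = (rng.random(200) < 1 / (1 + np.exp(-X @ w_true))).astype(float)
    sites.append((X, y))

w_global = np.zeros(2)
for _ in range(10):
    # Each site refines a copy of the global weights on its own data...
    local = [local_steps(w_global.copy(), X, y) for X, y in sites]
    # ...and only the weights, never the raw data, are averaged.
    w_global = np.mean(local, axis=0)

print("learned weights:", w_global)  # should approximate w_true
```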

For this market power to work, however, AI systems must be monitored, and there are currently no central offices for that. Who do you think should carry out this control of AI systems?

ZWEIG: I believe that for most social processes we already have arbitration bodies: for work, the works councils; for consumers, the consumer protection organizations; for private media, the state media authorities. But these bodies would of course have to be equipped with the appropriate competence.

From the employees' perspective, many questions in the field of labor and social affairs fall into the sensitive area you mentioned, from recruitment to bonuses, assessments, and automatic sorting mechanisms such as those already tested by a large American online retailer. Is it realistic to develop such competencies on a decentralized basis?

ZWEIG: That is absolutely realistic. The works councils have been thinking about this for years. And it really is the case that we run quite a lot of workshops, and to be honest, you can get a basic grip on the subject of AI systems in 45 minutes. That is why I am relatively optimistic that a basic understanding of how machine learning processes work, and of what they can and cannot do, can be built up quickly and effectively.

And should the works council then check whether the AI discriminates, and whether the data has been processed correctly?

ZWEIG: No, the works council itself cannot do that. Of course, this requires experts. But they will come once the market is there and there are offers.

What would these actors need in order to actually be able to control artificial intelligence systems?

ZWEIG: On the one hand, they need the specialists I mentioned. Above all, however, depending on how high a system's damage potential is, they need access to data and interfaces in order to understand what exactly is happening and whether, for example, there is discrimination. That is why we have made a regulatory proposal that assigns different transparency and traceability obligations based on the damage potential and on the degree of dependence on a decision (Figure 1).
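
Figure 1 itself is not reproduced in this text. As a purely illustrative sketch, such a grading of obligations could be thought of as a lookup over the two dimensions mentioned; the scales and the obligations listed here are invented.

```python
# Sketch: grading transparency obligations by damage potential and by the
# degree of dependence on a decision. Scales and obligations are invented.
def transparency_obligations(damage: int, dependence: int) -> str:
    """Both inputs on an invented scale: 0 (low), 1 (medium), 2 (high)."""
    risk_class = max(damage, dependence)
    return {
        0: "no special obligations",
        1: "disclosure of input data and quality measures",
        2: "full access to data and interfaces for external auditors",
    }[risk_class]

print(transparency_obligations(damage=2, dependence=1))
```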

How could that look in practice?

ZWEIG: My colleague Wagner-Pinter in Austria, who developed the AMS algorithm, offers his software together with, for example, a set of so-called social compatibility rules: rules on how the system should be used in practice.

What is in these rules?

ZWEIG: For example: a decision about which category a person ends up in must always be discussed with the job seeker. This person can object. They can view and correct their basic data at any time. This means that if the machine has made a decision based on incorrect data, the decision can be overridden. If it is overridden, the reason must be documented. And: the system is rebuilt every year, using only the data of the last four years. Job seekers have a right to be forgotten: poor employability during a phase of adolescent defiance does not have to accompany you into old age. We would additionally demand technical access options that allow lawyers, in cooperation with experts of course, to detect systematic unequal treatment.
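
A minimal sketch of two of these rules, the documented override and the rolling four-year data window; the structure and field names are invented for illustration.

```python
# Sketch: a documented override of a machine decision and a rolling
# four-year training window. Structure and field names are invented.
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass
class Decision:
    category: str
    override_log: list = field(default_factory=list)

    def override(self, new_category: str, reason: str) -> None:
        # A human may overwrite the machine's category, but only with a
        # documented reason.
        if not reason:
            raise ValueError("an override must be justified")
        self.override_log.append((self.category, new_category, reason))
        self.category = new_category

def training_window(records, today):
    """Right to be forgotten: keep only records from roughly the last four years."""
    cutoff = today - timedelta(days=4 * 365)
    return [r for d, r in records if d >= cutoff]

d = Decision("low")
d.override("middle", reason="basic data was outdated; corrected by job seeker")
print(d.category, d.override_log)

records = [(date(2015, 3, 1), {"id": 1}), (date(2019, 6, 1), {"id": 2})]
print(training_window(records, today=date(2020, 7, 1)))  # only the 2019 record
```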

What speaks against an AI TÜV as the body that carries out this control?

ZWEIG: It is not only the system itself that needs to be checked; that is just one part of the review. We do not have a single office that simultaneously assesses whether doctors have made mistakes and whether lawyers are working correctly; rather, there are institutions you can turn to if you have the feeling that one profession or the other makes systematic errors in its decisions. We also always need an approach that considers the overall process, as the example of the AMS algorithm has shown. In my opinion, we need an approach that, instead of sealing the software, accompanies and certifies the standardization of the quality of this overall process. This would also have the advantage that you no longer have to certify every version of a piece of software. Instead, you certify the quality assurance of the entire process, and as long as that process is continuously evaluated against criteria appropriate to the damage potential, the company is allowed to operate. But when it comes to state AI, we also need an independent institution.

In your opinion, could the field of labor and social affairs in public administration serve as a test field for seizing the opportunities that AI offers?

ZWEIG: The field of labor is difficult. It is precisely an area where I am unsure whether today's AI systems are complex enough to take into account the context dependency that we would actually like to have. The use of AI systems is often presented today as if there were no alternative. But of course there are alternatives, for example employing more and better advisors. So yes: on the one hand, it would be an interesting field, because we could learn a lot about how human decisions can be better supported. The computer forces us to define more clearly: what does success actually mean for us? Is it getting more people into work? How do we want to measure ourselves afterwards? I believe that this process in itself could do a lot of good. But whether, in the end, we will let machines judge people in these sensitive areas is something we have to discuss broadly!

This interview comes from a publication by the Federal Ministry of Labor and Social Affairs (BMAS) on the occasion of the German EU Council Presidency from July to December 2020. The accompanying volume offers scientific articles, interviews, viewpoints, and infographics on the BMAS's main topics during the German Presidency of the Council of the European Union. In this way, the BMAS aims to strengthen dialogue within the EU and, together with the European labor and social affairs ministers, identify EU-wide needs for action. You can find the complete digital accompanying volume here: