Amazon Echo @rohane

Modernity and Confidentiality: a National Proxy for Home Voice Assistants!


Alexa do this! Google do this! Who hasn’t heard the many and varied ads for the voice assistants that are now supposed to be as indispensable as, if not more so than, our refrigerators and hotplates? Here is an example of this type of advertising.

All this to persuade us that such a wonderful object must be present in every home (all over the world, according to Amazon). Especially for asking what the weather will be like (7 seconds into the video) while the window is right behind you. Let’s move on …

1) THE MAIN PRINCIPLES OF VOICE ASSISTANTS

For those not yet familiar with how a voice assistant works and is used, let us recall in a few words that it consists of a case containing a loudspeaker and a microphone. The microphone listens to what the people around the box are saying in order to understand the orders and requests addressed to it, and the loudspeaker delivers the answer aloud. (See the explanation of how it works on IDAP)

What uses?

Requests range from playing music to answering questions.

In short, we have a loudspeaker that emits sound and, above all, a microphone. A microphone that is in constant operation, always listening. And that’s the problem. Because in order for voice assistants to respond to you, they have to listen constantly, all the time, non-stop, 24 hours a day, 366 days a year in leap years. The purpose of this short enumeration is to make it clear that the presence of this famous Microphone implies listening and therefore constant monitoring. Of course, the name of the assistant must be spoken so that it understands that we are talking to it, and not to another person, and that it must be ready to serve. But in order to hear and understand that it is being spoken to, the voice assistant must listen constantly. Always listening.
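The principle can be illustrated with a minimal sketch, assuming speech has already been turned into text. The wake word, the function name and the sample sentences are hypothetical and only serve the illustration; real assistants work on audio, not on transcripts. What the sketch shows is that every sentence has to be inspected, even though only some are treated as requests.

```python
from typing import Iterable, List

WAKE_WORD = "alexa"  # assumed wake word, for illustration only


def requests_after_wake_word(utterances: Iterable[str],
                             wake_word: str = WAKE_WORD) -> List[str]:
    """Return only the utterances addressed to the assistant.

    Every utterance is inspected (the microphone never stops listening),
    but only those that start with the wake word are kept as requests.
    """
    requests = []
    for utterance in utterances:
        words = utterance.strip().lower().split()
        # Ignore trailing punctuation on the first word ("Alexa," -> "alexa").
        if words and words[0].strip(",.!?") == wake_word:
            requests.append(" ".join(words[1:]))
    return requests


heard = [
    "What shall we have for dinner?",
    "Alexa, what's the weather like?",
    "I think the neighbours are away",
]
print(requests_after_wake_word(heard))
```

Only the second sentence is treated as a request, yet all three had to be heard to find it.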

2) VOICE ASSISTANTS & CONFIDENTIALITY

So here we are with a spy bug that we have voluntarily, in all naivety, placed within our own walls, at the heart of our homes and our lives. How far away it seems, the time of the Oscar-winning film “The Lives of Others” [to be seen in the original version, of course 🙂 ] where, without wanting to give too much away, in the former East Germany, the oh-so-communist GDR, the main activity, if not the only one, of the government (via its political police, the STASI) consisted in spying on its fellow citizens by placing microphones in their apartments and houses. See the 44th second of the trailer for this microphone operation.

Now we’re doing much better. No more drilling, unscrewing, piercing! We voluntarily place the microphone ourselves. In the best possible places. We welcome the modern Trojan Horse with open arms. The one that doesn’t invade the city. The one that listens and knows everything about us. For who controls the transmission of the phrases picked up by these microphones? Where do they go? Who listens to them? Who analyses them? Who really knows what is being sent? Just the command sentences we give?

All this is supposed to be governed by charters and regulations that are supposed to protect our private lives, but whose main features are elasticity and the black hole.

Of course, this mistrust must be moderated by giving the benefit of the doubt or trust commensurate with the respect for confidentiality and privacy demonstrated on a daily basis by today’s companies in the digital world, i.e. an index between zero Kelvin and zero Fahrenheit.

Reporting exhaustively and in detail on the many scandals and bugs that have already occurred would dry up the oceans of bytes available to us on the Internet. Here’s just one link on the Guardian and another on Consumer Watchdog.

Especially now that the assistant principle has been extended to devices with screens, such as those from the manufacturer LENOVO. But a screen is also the first step towards integrating a Camera! Facebook has already taken that step with its Portal assistant, which includes one! Officially, it is intended for video calls. But who really decides when the camera is switched on? Let’s also remember that the founder of Facebook, Mark Zuckerberg, solemnly declared “We don’t want a camera in everyone’s living room”. This does not prevent him from having devices made that contain one, because his living room is clearly not his customers’ living room…

Here we are in a world that goes even further than the wildest dreams of GDR spy bureaucrats. Microphone and Camera. Seeing and listening all the time. Let’s add that these assistants, if they haven’t already done so, will integrate other sensors for humidity levels, air movements, vibrations (a door slamming or opening), etc., and they will, in addition, be coupled with other surveillance tools. Like watches that can already give us our pulse or heart rate. And who knows how soon our nerve impulses and other flows … ? So we’ll have a wonderful world of constant espionage where everything about us can be known in a second. And it goes even further. Through the real-time analysis of our words by experts or soon (already?) by Artificial Intelligences, we will no longer have anything unknown or secret from anyone.

The machines will know us more and better than we know ourselves …

3) ADVANTAGES & DISADVANTAGES OF VIRTUAL ASSISTANTS

Nevertheless, let’s list some of the advantages of these assistants:

  • Answering questions on topics such as:
    • General knowledge
    • News
    • Weather
  • Playing music or ambient sounds
  • Helping to follow a recipe
  • Providing a new form of gaming
  • Making video calls
  • Displaying Web pages based on voice search keywords

And let’s list, without claiming to be exhaustive, the problems and drawbacks generated by these assistants:

  • Permanent listening by the Microphone
  • No guarantee as to what, how much, to whom, how often bits and pieces of conversation can be sent.
  • Detection of emotions carried by the sound of the voice
  • Detection of the emotional state through analysis of what is said
    • Richness of the vocabulary used
    • Speed or slowness of speech
    • Placement of stress or spacing of syllables
    • Comparison with known history
  • Detection of the trust placed in a person or a brand by analysing the way in which the name and associated words are pronounced
  • Possibility of knowing through facial recognition or other objects (watches, telephones, etc.) who is in the room.
  • Movement analysis
    • Who?
    • What time?
    • How fast?
    • What kind of path?
    • What are the differences from other moves?
  • Ability to detect other events outside the field of view such as a door slamming or opening.
  • Analysis of the relationship between room temperature and emotional state
  • Detection of used or consumed objects. Facial recognition will not only be for people, it will also be used to know what we eat, at what time and how!
  • Let’s also add all of the points above, with the possibility that some things will go wrong
    • Unauthorized third-party personnel, such as subcontractors of device manufacturers, access the data.
    • Or hackers, whether grey-hat, black-hat or reverse-rainbow, start hijacking data for their own profit.
  • Weird behaviors are starting to emerge…
  • And probably a lot of other possibilities for monitoring or analysis …

So should we do without the advantages of modernity in order to be rid of its disadvantages? Should we renounce progress or surrender to the great digital powers? There are 2 ways out of this dilemma. As usual, we have both a factual and a strategic solution.

a) THE FACTUAL RESPONSE

Firstly, it can be decided, or even imposed, that listening by these devices is not permanent and that it is controlled and decided by the users.

In short, there is NO reason why these devices should not be controllable by their users. It is only by the will of the manufacturers that they are delivered with this type of operation. And by the passivity of each nation’s public authorities that such products are let into our homes. But as we have said, this is a factual response. And it is also a limited one. For even if the user controls and limits the listening times and possibilities, nothing will prevent data transmissions, whether or not they are planned by the manufacturer. And nothing will prevent analyses of this data and sales of these analyses to anyone. With this factual approach we have therefore, as is often the case, limited the problem or problems, but we have not solved them. To do that, we have to move to another level.

b) THE STRATEGIC RESPONSE: THE NATIONAL PROXY

If we want real control over the digital data collected, we must remain consistent and also use computer tools, especially network tools, because this is where the back-and-forth between the assistants and the manufacturers’ servers takes place (see the technical diagram, in English, on the stuffi.fr website). And this is where the notion of a Network Proxy comes in. Without going into too much technical detail, which would be doubly useless, we can reduce the concept of the Proxy to that of an intermediary. In a network with a Proxy, the Proxy is a mandatory passage through which a request sent by one device (Station A) must travel to reach the targeted device (Station B). Technically speaking, A can only talk to B via the intermediary, and B can only answer A via that same intermediary.

This type of scheme is found in many client-server architectures, particularly for consulting web pages, but also for sending computer data in the broad sense. It is used for various reasons, including security between machines and the possible acceleration of exchanges through Web-type caches. To simplify, installing such equipment provides an intermediary that will check and validate that the information transmitted by the assistants in your house is not confidential, does not fall outside the law and conforms to what you wish to transmit. But what also interests us in the context of deploying a National Proxy for voice assistants is another property of proxies: confidentiality (some advantages and disadvantages of Proxies).
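To make the idea of a mandatory intermediary concrete, here is a minimal sketch in Python. The class, the function names and the validation rule are all hypothetical and chosen for illustration; a real proxy works on network connections, not on plain function calls. Station A never calls Station B directly: it only knows the proxy, which decides whether the request goes through.

```python
from typing import Callable, Optional

Handler = Callable[[str], str]


class Proxy:
    """Mandatory intermediary between a device (A) and a server (B)."""

    def __init__(self, upstream: Handler,
                 is_allowed: Callable[[str], bool]) -> None:
        self._upstream = upstream      # Station B, reachable only via the proxy
        self._is_allowed = is_allowed  # validation rule applied to every request

    def forward(self, request: str) -> Optional[str]:
        """Relay the request to B if it passes validation, else drop it."""
        if not self._is_allowed(request):
            return None
        return self._upstream(request)


def manufacturer_server(request: str) -> str:
    """Stand-in for Station B (hypothetical)."""
    return f"answer to: {request}"


# Illustrative rule: refuse anything that mentions a password.
proxy = Proxy(manufacturer_server,
              is_allowed=lambda r: "password" not in r.lower())

print(proxy.forward("play some jazz"))       # relayed to B
print(proxy.forward("my password is 1234"))  # dropped by the proxy
```

The answer also comes back through `forward`, so the exchange passes through the intermediary in both directions.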

4) PRIVACY THROUGH THE NATIONAL PROXY

Indeed, Proxies offer the possibility, at least theoretically, to “mask” the identity of the person making the request. Identity does not necessarily mean the user’s name and address, but its equivalent in a computer network, i.e. their IP address and digital footprint (see the Avira@preview article on the subject, which preaches for its own parish).

So among the advantages (necessities?) of each country setting up its own national Proxy for voice assistants, confidentiality comes first. The National Proxy will make it possible to mask the user’s digital identity, in the broadest sense (IP address and digital fingerprint), from the manufacturer of the assistant in order to reduce and control the marketing profile the latter builds. The data sent to the manufacturer’s servers via the proxy will also be checked so that only properly formed sentences, orders and requests get through, with the mandatory inclusion of the manufacturer’s keyword (Alexa, Google, Mark, Skynet…). Anything that is not legitimate will be filtered out and will not be sent to the manufacturers’ servers. On the same principle, metadata collected apart from the microphone and camera will be verified and approved before being sent to the manufacturers’ servers. Thus it can be assumed that sending key information such as a coarse geolocation and a precise local time will be allowed. (This will at least allow all those living in bomb shelters to legitimately ask the key question “What’s the weather like?”). For the time being, this metadata seems to be limited, but as these virtual assistants develop, it will become richer and richer and will need to be managed better and better.
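The filtering and masking described above can be sketched as follows. Everything here is an assumption made for illustration: the field names, the list of keywords and the choice of rounding coordinates to one decimal place (roughly ten kilometres) as a stand-in for "coarse geolocation". The sketch shows three things: requests without the keyword are dropped, the IP address and device identifier are never forwarded, and the location is coarsened while the local time is kept.

```python
from dataclasses import dataclass
from typing import Optional

WAKE_WORDS = ("alexa", "google")  # assumed manufacturer keywords


@dataclass(frozen=True)
class DeviceMessage:
    """What the assistant sends to the proxy (hypothetical fields)."""
    ip_address: str
    device_id: str
    transcript: str
    latitude: float
    longitude: float
    local_time: str


@dataclass(frozen=True)
class ForwardedMessage:
    """What the proxy lets through: no IP address, no device identifier."""
    transcript: str
    latitude: float
    longitude: float
    local_time: str


def filter_and_mask(msg: DeviceMessage) -> Optional[ForwardedMessage]:
    """Drop anything not addressed to the assistant, mask the rest."""
    words = msg.transcript.lower().split()
    first = words[0].strip(",.!?") if words else ""
    if first not in WAKE_WORDS:
        return None  # not a legitimate request: never leaves the proxy
    return ForwardedMessage(
        transcript=msg.transcript,
        latitude=round(msg.latitude, 1),    # coarse geolocation only
        longitude=round(msg.longitude, 1),
        local_time=msg.local_time,          # precise local time is kept
    )


request = DeviceMessage("203.0.113.7", "dev-1",
                        "Alexa, what's the weather like?",
                        48.8566, 2.3522, "14:05")
chatter = DeviceMessage("203.0.113.7", "dev-1",
                        "we should repaint the kitchen",
                        48.8566, 2.3522, "14:06")

print(filter_and_mask(request))
print(filter_and_mask(chatter))
```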

Another possible advantage of using a proxy is the ability to flush its caches very regularly (at each request?) to best protect the confidentiality of the requests.

The processing time added by Proxies will become increasingly negligible with the continuous deployment of fibre optics and 5G, then 6G and other XGs, that will gradually spread across the globe. Again, let’s not just think in terms of years, but in tens, hundreds or even thousands of years.

a) NEW COLLECTIVE SERVICES

One could also take advantage of such a Proxy to create a National Bank of collected and entirely anonymized data, with clear objectives and methods, collecting and structuring certain types of information that could be useful to the Community, provided that this is strictly framed and technically possible (see below). New services could thus be set up via this national Proxy, where requests would be intercepted and processed solely by the Proxy (“Alexa, call the fire brigade!” or “I’m having a heart attack”). Knowing where the request comes from, as with a classic phone call, will make people more responsible. And we can even imagine a short dialogue via the voice assistant, as during calls to the emergency services, to validate the dispatch of a team to the scene. Still within this framework of additional services directly supervised by the Proxy, we could make it easier to request administrative documents (“I certify that I am requesting a birth certificate. My validation code is …”). Many other possibilities exist. In a way, by adding a national service, the Proxy, on top of products coming from private companies, we can develop new advantages and possibilities for all the members of the Community. There is indeed a possible alliance between private initiatives and COOPERACTIVE ASSETS.

b) OBJECTIONS TO DEPLOYMENT

First of all, as a reminder, in order not to weigh the subject down and to remain understandable to as many people as possible, many purely technical aspects are not described in this article. That is not the point. The point is to establish that this type of service, the National Filtering and Privacy Proxy, is quite feasible. Especially if, in addition, software and manufacturer-specific constraints are imposed on manufacturers to ensure that their devices comply with current standards.

Another objection could be the cost of deploying such a mechanism. But don’t they say that freedom is priceless! This mechanism should also be seen as a means of protection and security. Cars cost a little more with seat belts, but everyone now understands why they matter. It will be the same for the National Proxy.

But the main objection, and a legitimate one, might be that deploying such a mechanism would amount to state surveillance. Leaving aside the astonishment that those who are concerned about state surveillance are not revolted and outraged by the same surveillance carried out by private and, moreover, foreign (at present only US) companies, let us propose some necessary answers to this argument. Once it has been said that access to the data and its exploitation would be strictly regulated by laws, norms, rules and the presence of an independent institution, the spirit that one wishes to instil in the system will have been affirmed. But we will not have made any progress on proving the effectiveness of the protective barriers we wish to put in place.

That’s when we have to go back to the purely technical aspects. Still aiming to remain as understandable as possible, we will not go into the details of implementation, but we will mention at least two protective mechanisms.

Firstly, unlike what can happen with some manufacturers (who really knows?), there will be no storage of everything requested by users. There will be no Data Lake. Nothing requested by a user will be stored. Moreover, this will save storage costs 😉 [to be weighed against the amount of metadata that one may decide to keep over time for uses that benefit the Community].

Secondly, one can (must?) have the assistants installed in users’ homes send the sound information as encrypted data to the Proxy, which passes this same encrypted information on to the manufacturers’ servers. In return, these servers will send the Proxy new encrypted information that only the device that issued the request will be able to decode in order to give the user a clear answer. And this regardless of the form of the answer (sound, video, text). At no time will the National Proxy see the whole of the requests made by users, but it will still have access to parts of each request, if only to verify that the call to the assistant was indeed triggered by the user (Alexa?) and that the metadata collected and transmitted by the assistants respects the privacy standards set.

The implementation of such a mechanism requires the deployment and approval of encryption keys, such as private and public keys, but again, the subject of this article is not the technical description of the proxy’s implementation, but the possibility and necessity of implementing it.
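Without claiming to describe a real implementation, the principle of a proxy that relays what it cannot read can be sketched as follows. The envelope fields and function names are hypothetical. The encryption used here is a simple one-time-pad XOR with a key shared between the device and the manufacturer, chosen only because it needs nothing beyond the Python standard library; it is a stand-in for the private and public key mechanisms mentioned above, not a recommendation.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """One-time-pad XOR (illustrative stand-in for real encryption).

    The same call encrypts and decrypts. The key must be as long as
    the data and must never be reused.
    """
    if len(key) != len(data):
        raise ValueError("key and data must have the same length")
    return bytes(d ^ k for d, k in zip(data, key))


@dataclass(frozen=True)
class Envelope:
    """What the proxy receives (hypothetical layout)."""
    wake_word_detected: bool  # cleartext: the proxy may check this
    coarse_region: str        # cleartext metadata the proxy may check
    ciphertext: bytes         # opaque to the proxy: it has no key


def proxy_relay(envelope: Envelope) -> Optional[bytes]:
    """Check the cleartext header, then pass the ciphertext on untouched."""
    if not envelope.wake_word_detected:
        return None
    return envelope.ciphertext


# The key is known to the device and the manufacturer, never to the proxy.
request = b"what's the weather like?"
key = secrets.token_bytes(len(request))

envelope = Envelope(True, "FR", xor_bytes(request, key))
relayed = proxy_relay(envelope)

# Manufacturer side: only the key holder recovers the request.
print(xor_bytes(relayed, key))
```

The proxy can still refuse an envelope whose header says the assistant was not called, which is the partial visibility the text describes.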

c) POSSIBLE TIMING

Please note that current assistants are not necessarily designed for, or compatible with, such an operating mode. It will be necessary to impose, for each country, a date after which no assistant can be sold if it does not comply with the newly decreed standards. And another date (because the subject is different) after which no old-style voice assistant may be used or operational at customers’ premises [in any case, at the speed at which manufacturers manage their planned obsolescence, it will only be a matter of months 😉 ].

d) THE ISSUES

They are colossal and decisive.

If the Citizen does not have control over what happens in his own home, if a State does not control or does not know what Private Companies collect as information from its own Citizens then the notions of independence and autonomy for that country and its people are at risk.

And without independence and autonomy, there is no true Freedom possible. In digital terms, there are 3 major fields, battlefields between private companies seeking to take over all human activities.

  • The office. It was the time of the first computers and the domination of Microsoft.
  • The outside. That’s everything you do between the office and your home. This is the era that really begins with the 2000s and the democratization of the mobile phone. It’s the era of Google via Android. Note that part of this outside field is now playing out through the connected and autonomous car.
  • The inside. That’s everything you do in your house.

Of course, everything overlaps a bit. You can use a mobile phone (Google) at home and work with a computer (Microsoft), but viewing the struggles for digital domination through this geographical typology allows a better understanding of the strategic movements underway. Voice Assistants thus belong to the field of the inside, that of the House. With all the repercussions of the development of home automation. All objects will be connected. Connected to each other and connected to you. Permanently. With dialogues, exchanges and orders that are more and more relevant and precise. An example of this development can be found on the page of what Amazon calls skills. They are constantly growing in number, and it is a commercial struggle between Amazon, Facebook, Google, Apple, Microsoft and soon the Chinese Digital Giants.

After text and images, let’s not let Private Companies get their hands on the last digital battlefield still being contested: voice.

In this respect, we are heading into an exciting time for lovers of great technological and commercial manoeuvres. But we are still living in the old days. A lot of effort is expended, a lot of energy and time wasted, with no collective intelligence and no strategic vision. Fortunately, another reality can be created. And lived.

4) THE ANSWERS

There are two possible levels. – The Law. We saw that this answer was of a more defensive type. And that, in addition to requiring permanent control (by what technological means?), it does not resolve the fundamental issues of data collection and exploitation.

– The Proxy. This answer, which is the main one developed in the body of the article, goes further and starts to turn the tide by controlling what is transmitted to the manufacturers or operators of the Assistants. It is an active response, one that technically controls the behaviour of these same manufacturers (which is much more secure than simple decrees or articles of law). It is a good and constructive approach. But a temporary one. It is more of an intermediate step before the best answer on the substance. Indeed, in this scheme, there is always the risk of lagging behind the developments pushed by the manufacturers. Plus the usual little tune of “you’re holding back innovation, we can no longer develop new services, blah blah blah” that will soon be heard.

5) THE FORM OF THE ANSWER

This requires a strategic response in terms of substance, form and duration. This response can only be achieved through COOPERACTIVE ASSETS and one of their tools, Open Source or Free Software (see the difference according to Richard STALLMAN). And this even if the economic organisation of Open Source will probably have to evolve to meet the new challenge it faces, namely the use of its creations without financial compensation (this is another subject). Indeed, only the most Universal organization possible, in the form of COOPERACTIVE ASSETS, precisely the opposite of national proxies, can allow the optimal development, for all, of this evolution that Voice or Household Assistants represent.

But this requires a system, an organization that:

  • Does not restrict services in their inventions and innovations.
  • Respects the confidentiality and privacy of users.
  • Allows the largest possible number of developers (and not necessarily large private companies, still less oligopolies) to contribute to the improvement of these services.
  • Allows the largest possible number of independent manufacturers to offer their models or accessories.
  • Allows users (those who are interested and motivated) to modify or improve hardware and software.
  • Allows the community, while respecting confidentiality and anonymity, to gather important and sometimes structuring information for other common goods.
  • Allows the creation or development of new services through these assistants, such as (see above) requests for administrative paperwork. One can also imagine making appointments or holding dialogues with “Robots” or “Artificial Intelligence” to advance or resolve this or that point. Not to mention interactions with connected objects in the house or elsewhere.

Everything is to be imagined and set up for the greater good of all. This deployment, in the form of multiple COOPERACTIVE ASSETS working together in full cooperation and intelligence, will require the enactment of Technical Standards and Articles of Law requesting, imposing and controlling the functioning and openness of these assistants.

6) THE BASIS OF THE ANSWER

The most urgent task is therefore, as was done for the Web (which also explains its success), to create Common Protocols for discussion and exchange between devices.

A Common Technical Language must be created between all devices and software related to Voice Assistants and Connected Objects.

A Language that all these devices will have to speak and understand. A Language that will be able to evolve, improve and enrich itself. A Language that any manufacturer will have to use. It will be a mandatory Standard. A Language whose development and security will be financed by a common and minimal participation taken from any sale of hardware or software using the Language (remember that the use of the language is mandatory for all devices and software/applications). There won’t be an Amazon Language, a Google Language, a Xiaomi Language, etc …
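To give an idea of what a message in such a common Language might look like, here is a purely hypothetical sketch. No such standard exists today: the version number, the field names and the choice of JSON are assumptions made only for illustration. The point is that any device from any manufacturer could produce and check the same message format, and that the version field leaves room for the Language to evolve.

```python
import json
from typing import Any, Dict

PROTOCOL_VERSION = "0.1"  # hypothetical version of the common Language
REQUIRED_FIELDS = {"version", "type", "payload"}


def encode_message(msg_type: str, payload: Dict[str, Any]) -> str:
    """Serialize a message in the (assumed) common format."""
    message = {"version": PROTOCOL_VERSION, "type": msg_type,
               "payload": payload}
    return json.dumps(message, sort_keys=True)


def decode_message(raw: str) -> Dict[str, Any]:
    """Parse a message and reject it if mandatory fields are missing."""
    message = json.loads(raw)
    if not isinstance(message, dict):
        raise ValueError("message must be a JSON object")
    missing = REQUIRED_FIELDS - message.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return message


# A lamp from one manufacturer and an assistant from another
# would exchange exactly the same text.
raw = encode_message("set_light", {"room": "kitchen", "on": True})
print(raw)
print(decode_message(raw)["payload"])
```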

There will be only one technical language, so that everyone communicates with everyone and everyone can understand everyone. When we see the benefits, in every sense of the word, that the Web protocols (here the Hypertext Transfer Protocol) have brought to the whole of Humanity (and also the financial benefits to the shareholders of GAFAM and BATX), we see how much a common language, made of technical protocols, will improve the quality and richness of life and the creative possibilities of the greatest number.

Moreover, with the successful experience of the World Wide Web and of the organizations around it, such as the World Wide Web Consortium and the Internet Architecture Board, we know better how to structure and protect these types of COMMON GOODS.

So it’s up to us to create!


Philippe AGRIPNIDIS

COOPERACTIVE Assets Creator

Cooperactive Assets are a new form of social and economic organization. ALPHARIS is dedicated to the imagination, assistance, design, implementation and dissemination of this new form of economic and social organization for the creation of products, services and inventions

