
Congressional Scandal Generator

An interactive installation that generates congressional scandals and pairs them with generated congresspeople, in order to criticize the corruption that stains the current political structure of the Peruvian government.

produced by: Alesandra Miro Quesada                                                                                                                                                                                                

 

Concept and Aim

Latin American politics has a long history of tragic dictatorships, civil wars, and political coups. Venezuela has been suffering from failed military governments for more than a decade, the Argentinian peso has lost 67% of its value since 2020, and the current Brazilian president is a conservative, ultra-right military man who has made headlines for his anti-abortion stance and for sexist, racist, and homophobic comments.

As a Peruvian growing up, this was just the way governments worked. I quickly learned that one way or another, no matter the country, race, or wealth, there always lurked a common denominator: corruption. From cocaine sales to fake school certificates, from leaked racist audio recordings to unsolicited nude pictures, the Peruvian Congress is seasoned with the most outrageous scandals, ranging from the gravely serious to the merely ridiculous.

This project aims to use humor and political commentary to create a disobedient installation. It uses Machine Learning models, trained on news data from real congressional scandals in Peru, to generate synthetic scandals as well as synthetic congresspeople to go alongside them.

 

Technical Approach

Project Pipeline:

This disobedient installation consists of two main components: Machine Learning for the generation of visual and text data, and user interaction for the triggering of this data, that is, the scandals. It was very easy to get lost in the project, and in fact, I sometimes did. So it was very helpful to create a project pipeline and write directions to remind me of how to continue when lost.

The Datasets:

Approaching the data collection was a very daunting process. I wanted my installation to not only have written scandals but also images to go with them. This meant I had to collect two datasets and train two Neural Networks (NNs). 

Before I started to train any NNs I needed to constrain my data as much as I could to allow the best training possible, so I set myself a series of rules. First, only take information from the main Peruvian newspapers, magazines, and radio sites (social media articles and any other written news on the internet were out of bounds). Second, the data should only belong to the period from 2000 to 2020. I was also careful to record the sites I visited in Zotero, to be able to trace back the articles if needed.

I started with the text generation and braced myself as I embarked on a slow and meticulous process of browsing the internet for news articles on corruption scandals in the Peruvian Congress. On and off, I spent two days collecting corruption articles online and copying them into individual text files. In total, I collected 641 articles, each roughly between 500 and 1000 words.

The approach for the collection of the images was somewhat similar. Whenever I collected an article, I also took screenshots of any images of the corresponding congresspeople in the article. In total, I collected 1089 images of Peruvian congresspeople, which had to be individually cropped to just the face of the person. This would ensure that the Neural Network learns and generates only faces as it trains.

Below are the two datasets of the collected and processed newspaper articles as well as the collected images. These are now ready to start the training process.                                                                                                                                                                      

 

The Training:

I decided to use RunwayML for the Machine Learning process because of its quick prototyping power, user-friendly interface, and support for OSC, HTTP, and Java communication.

For the text generation, I used OpenAI's GPT-2 generative text model. 'GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset of 8 million web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text' (OpenAI, 2019). In our case, it was going to predict congressional scandals.

For the image generation, I used Nvidia's StyleGAN2 generative image model, another large pre-trained model, which uses the deep learning method of Generative Adversarial Networks (GANs). In short, GANs can learn the specific pixel patterns of almost anything, provided you have enough images of it to train on. In our case, it was corrupt congresspeople, and I had more than enough.

What followed was a period of waiting. Training Neural Networks takes time and money, much more than I had anticipated. Training on my laptop's CPU was not an option, so I relied on Runway's option to rent GPU or CPU time in its cloud. This put a few restrictions on what I had envisioned for the training, and on how realistic I wanted my models to be. I therefore set a budget of $50 for 5 to 6 hours of training per model.

Below I'm going to show some of the progress and results of the training. 

 

I was so impressed with the training results for both the image (StyleGAN2) and text (GPT-2) systems. As soon as I started to recognize faces and read somewhat coherent sentences, I was ecstatic!

In total, I trained both models until my budget was met, almost 6 hours. It was so interesting to witness how both systems progressed and evolved and even though they need more training (especially the text) this project has opened my eyes to the vast possibilities in Machine Learning. I am very excited to keep on training these models and hopefully find a grant that would be interested in supporting this research. 

Now that I had successfully generated my congresspeople and their scandals, it was time to bring the installation to life and interact with this data. 

 

 

Data organization and Interaction:

I used OpenFrameworks as a tool to store the information generated by the Neural Networks as well as to create the user interaction and manage the flow of data to the user. 

Using an HTTP connection, OpenFrameworks would send a string prompt to Runway, which in turn would generate a written scandal. Depending on the number of characters in the text, it took between 15 and 20 seconds for Runway to generate a new 560-character scandal. This meant that users would have to wait that long each time they wanted to interact with the installation. This would ruin the UX because it would interrupt the thrill and lose the audience's interest. Nobody likes waiting.

This is when my tutor Theo suggested creating a buffer object that would store the Runway scandals until they were needed. This meant that I could leave the algorithm generating scandals while there was no interaction, and the queue would slowly fill up. Then, when a new user interacted, the buffer would already hold scandals and could trigger them seamlessly.
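The buffering idea can be sketched in plain C++ with std::queue. This is only a minimal illustration under stated assumptions, not the project's actual code: the class name ScandalBuffer, its methods, and the fixed capacity are all hypothetical, and the HTTP call to Runway is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>

// Hypothetical buffer of pre-generated scandals. All names here are
// illustrative and not taken from the project source.
class ScandalBuffer {
public:
    explicit ScandalBuffer(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);  // a zero-sized buffer could never serve a user
    }

    // True while the idle loop should keep requesting new text from the model.
    bool needsRefill() const { return queue_.size() < capacity_; }

    // Called when a generated scandal arrives; rejected if the buffer is full.
    bool push(const std::string& scandal) {
        if (!needsRefill()) return false;
        queue_.push(scandal);
        return true;
    }

    // Called on user interaction; returns false if nothing is ready yet.
    bool pop(std::string& out) {
        if (queue_.empty()) return false;
        out = queue_.front();  // oldest scandal first (FIFO)
        queue_.pop();
        return true;
    }

    std::size_t size() const { return queue_.size(); }

private:
    std::size_t capacity_;
    std::queue<std::string> queue_;
};
```

In a setup like this, the application would check needsRefill() while nobody is interacting and send a new prompt only when there is room, so that a user interaction becomes a cheap pop() instead of a 15 to 20 second wait.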

For the images, the solution was a bit more tricky. Runway allowed only one model to run at a time, so either the text or the image could be generated live, but not both. This meant that I had to choose which of the two would be generated in real time whenever a prompt was issued. I chose the generated text because working with strings would allow me to prompt and organize the images more easily than the other way around.

I took the prompts sent to Runway and categorized them into congresswomen and congressmen. Using string compare allowed me to tell OpenFrameworks when a congressman prompt was sent to Runway. The same prompt then also triggered a congressman image to be stored in the queue. The same process applied when a congresswoman prompt was issued. The male and female images were kept in separate image directories in the bin folder, so it was just a question of triggering them correctly and on time. Queues are my new favorite object.
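As a rough sketch of this classification step, the snippet below uses std::string::compare to check how a prompt begins and then queues a matching image path. The prompt prefixes, function names, and file names are assumptions made for illustration; the real prompts and directory layout are not given here.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

enum class Gender { Male, Female, Unknown };

// Classifies a prompt by its opening words. The two prefixes are
// hypothetical placeholders, not the prompts used in the installation.
Gender classifyPrompt(const std::string& prompt) {
    static const std::string male = "The congressman";
    static const std::string female = "The congresswoman";
    // compare(pos, len, str) returns 0 when the substring equals str.
    if (prompt.compare(0, male.size(), male) == 0) return Gender::Male;
    if (prompt.compare(0, female.size(), female) == 0) return Gender::Female;
    return Gender::Unknown;
}

// Picks one path from a non-empty pool, wrapping the index around.
const std::string& pickImage(const std::vector<std::string>& pool,
                             std::size_t index) {
    assert(!pool.empty());
    return pool[index % pool.size()];
}

// Pushes an image path matching the prompt's category onto the image queue.
// Returns false if the prompt is unrecognized or the matching pool is empty.
bool queueImageFor(const std::string& prompt,
                   const std::vector<std::string>& maleImages,
                   const std::vector<std::string>& femaleImages,
                   std::size_t index,
                   std::queue<std::string>& imageQueue) {
    const Gender g = classifyPrompt(prompt);
    if (g == Gender::Unknown) return false;
    const std::vector<std::string>& pool =
        (g == Gender::Male) ? maleImages : femaleImages;
    if (pool.empty()) return false;
    imageQueue.push(pickImage(pool, index));
    return true;
}
```

Because the image is queued at the same moment the prompt is sent, the image queue and the text queue stay in the same order, which is one simple way to keep each face paired with its scandal.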

The algorithms read as follows:                                                                                                                                                                                          

Organizing the HTTP connection, data management, and data classification was definitely one of the most challenging aspects of this project. I really had to think very hard in terms of logic and truly understand what I wanted to achieve. More often than not, I would blindly start coding and soon find out that I was using the wrong object or overcomplicating things. What really helped me, in the end, was to physically write up the steps and algorithm of the project. Once I saw it on paper, it helped me remember where everything was supposed to go and how it was working.

At first, I used the space bar on the keyboard to test the progress of my project. This was very useful as a starting point, as I could quickly trigger and test the scandals and images while programming. When the functionality was complete, I moved on to the real interaction. My original plan was to use an Arduino button to trigger each scandal when pressed. However, once assembled, the setup looked very underwhelming and unattractive, so I decided to change it to something more interesting.

I shifted my approach to triggering by using computer vision color detection. I decided to design and print a yellow button that would be recognized by the camera and consequently trigger a scandal. I used the week 12 color detection class examples to get me started. It took me at least a couple of days to implement it and, again, to figure out the logic behind how it would work. I quickly found out that it would be harder than I thought, but it all came down to if statements and timers acting in unison.
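One possible shape for this logic is sketched below in plain C++: count the pixels that look yellow in a frame, then combine an if statement on that count with a cooldown timer. The thresholds, the class name ButtonTrigger, and the choice to fire when the yellow button appears in view are all assumptions for illustration; the installation's actual rules may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts pixels that look "yellow" in a tightly packed RGB buffer
// (3 bytes per pixel). The thresholds are illustrative guesses.
std::size_t countYellowPixels(const std::vector<std::uint8_t>& rgb) {
    assert(rgb.size() % 3 == 0);
    std::size_t count = 0;
    for (std::size_t i = 0; i + 2 < rgb.size(); i += 3) {
        const int r = rgb[i];
        const int g = rgb[i + 1];
        const int b = rgb[i + 2];
        if (r > 180 && g > 180 && b < 100) ++count;
    }
    return count;
}

// Hypothetical trigger: fires once when the button becomes visible,
// and not again until a cooldown has elapsed.
class ButtonTrigger {
public:
    ButtonTrigger(std::size_t minPixels, double cooldownSeconds)
        : minPixels_(minPixels), cooldown_(cooldownSeconds) {}

    // Call once per frame with the yellow pixel count and the current time.
    bool update(std::size_t yellowPixels, double nowSeconds) {
        const bool visible = yellowPixels >= minPixels_;
        const bool cooledDown =
            !hasFired_ || (nowSeconds - lastFire_ >= cooldown_);
        // Fire only on the transition from not visible to visible.
        const bool fire = visible && !wasVisible_ && cooledDown;
        wasVisible_ = visible;
        if (fire) {
            lastFire_ = nowSeconds;
            hasFired_ = true;
        }
        return fire;
    }

private:
    std::size_t minPixels_;
    double cooldown_;
    double lastFire_ = 0.0;
    bool hasFired_ = false;
    bool wasVisible_ = false;
};
```

In an openFrameworks app, the pixel data could come from ofVideoGrabber::getPixels() and the time from ofGetElapsedTimef(), with update() called from the app's update loop. The cooldown keeps a button that stays in view from firing a scandal on every frame.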

The algorithm reads as follows:                                                                                                                                                                      

 

Code Walkthrough                                                                                             

Future development

Even though I am quite pleased with the outcome of my project, I definitely want to keep developing it into a fully-fledged installation.

The implementation of English translations for the written scandals is paramount if I want to present my work to a wider audience that is not necessarily Peruvian. I actually attempted to implement a translation algorithm through HTTP but the deadline was approaching so I decided to leave it for further development. Nevertheless, if accomplished, I think it will really make the project that much more professional. 

Finally, I would like to work on the physicality of the installation. Because of the current COVID-19 situation, I was unable to create what I had envisioned as my installation. This is something that I would definitely want to explore because it will make the experience that much more interesting to interact with as well as eye-catching for the audience. 

 

Self-evaluation

This project definitely had its ups and downs as I learned to steer OpenFrameworks in the direction I wanted. The sheer number of possible approaches, which at the start was very overwhelming, proved to be a necessary learning curve because it taught me that I really need to work out the logic of my program before I start to code.

More often than not, I would have an idea for a solution, start coding straight away, and encounter problems halfway through. Most of these problems could have been avoided had I properly thought through my logic and studied which functions and objects were the right ones to use. I guess this is something that comes with practice, and towards the end, I felt like I had really learned something new.

On the Machine Learning side, I was really unprepared for the amount of curating and data organization that I had to do. I expected the Neural Networks to seamlessly process any data that I gave them, but eventually found out that the results of the training are only as good as the dataset you provide. Moreover, it is very hard to create a dataset good enough to help the machine learn what you want. This ended up taking me days instead of hours and proved to be an unexpected but rewarding setback.

Overall I am very happy that I approached a concept that included Machine Learning because it really is a magical way of computing. It has opened my eyes to so many possibilities of using data and I will definitely be exploring more of it.  As for OpenFrameworks, it was a steep learning curve, but I have finally learned to love its great potential and power of adaptation in object-oriented programming. 

References and Documentation:

Addons:

  • https://github.com/runwayml/ofxRunway
    • which needs:
      • https://github.com/bakercp/ofxHTTP
      • https://github.com/bakercp/ofxIO
      • https://github.com/bakercp/ofxSSLManager
      • https://github.com/bakercp/ofxNetworkUtils
      • https://github.com/bakercp/ofxMediaType

 

Runway:

  • https://learn.runwayml.com/#/
  • https://openai.com/blog/better-language-models/
  • https://github.com/runwayml/processing/tree/master/GPT2
  • Models used through Runwayml
    • https://github.com/NVlabs/stylegan2
    • https://github.com/openai/gpt-2
  • https://www.youtube.com/watch?v=ARnf4ilr9Hc
  • https://www.youtube.com/watch?v=7btNir5L8Jc
  • https://heartbeat.fritz.ai/animated-stylegan-image-transitions-with-runwayml-57a2e20db80f

 

OpenFrameworks:

  • https://forum.openframeworks.cc/t/randomly-shuffle-order-of-elements-in-a-vecrot-or-array/22914
  • https://forum.openframeworks.cc/t/random-number-generation/15545
  • http://www.cplusplus.com/reference/queue/queue/front/
  • https://openframeworks.cc/documentation/video/ofVideoGrabber/
  • https://forum.openframeworks.cc/t/changing-the-camera-input-while-the-app-is-running/7622
  • http://www.cplusplus.com/reference/string/string/compare/
  • https://openframeworks.cc/documentation/graphics/ofTrueTypeFont/
  • https://www.geeksforgeeks.org/queuepush-and-queuepop-in-cpp-stl/
  • https://www.cosmiclearn.com/cplusplus/stdqueue.php
  • https://en.cppreference.com/w/cpp/algorithm/random_shuffle

 

News articles:

  • Below is a sample of some of the news articles I collected:
    • ‘Apra: Dirigencia de La Libertad Pide La Expulsión Del Congresista Elías Rodríguez | Canal N’.
    • ‘Arequipa’.
    • ‘Arlette Contreras presentó su carta de renuncia del partido Frente Amplio’.
    • LR, ‘Arrestan a candidato del FA condenado por corrupción’.
    • ‘Becerril’.
    • Peru21, ‘Beto Ortiz tildó de “delincuentes y corruptos” a los congresistas’.
    • LR, ‘Candidata a presidir Poder Judicial anuló sentencia de Cecilia Chacón’.
    • ‘Carlos Bruce Califica de Absurda Investigación En Su Contra | Noticias | Agencia Peruana de Noticias Andina’.
    • Peru21, ‘Carlos Bruce considera que denuncia en su contra “es una patraña”’.
    • ‘Carlos Bruce Sobre Investigación de La Fiscalía En Su Contra Por Supuesta Corrupción: “Esa Es Una Patraña” (VIDEO) Política | Correo’.
    • ‘Carlos Bruce: “El Apodo de Mamani Era ‘El Padrino’”’.
    • ‘Carmen Lozada de Gamboa: Mi Esposo Trabajó Para El SIN | Política - La República’.
    • ‘Caso Yovera: Consejo Directivo Se Interrumpió En El Congreso Por Presencia de Heriberto Benítez Política | Correo’.
    • ‘Cecilia Chacón: “Al No Poder Enfrentarnos En Las Urnas, Nos Quieren Disolver Con Absurdos Jurídicos”’.
    • ‘César Acuña fue denunciado en España por plagios en la Complutense’.
    • ‘César Acuña’.
    • LR, ‘Citan a ex congresista Medelius / Subcomisión que evalúa denuncia constitucional por manipul’.
    • ‘Clemente Flores: En Mi Caso No Hay Nada y Voy a Ir Hasta Las Últimas Consecuencias | America Noticias’.
    • Correo, ‘Clemente Flores’.
    • Perú, ‘Comisión Orellana cita a hijo y a socio de congresista Elías’.
    • ‘Comisión Orellana Cita al Congresista Fujimorista José Elías Congreso | El Comercio Perú’.
    • ‘Comision Permanente Del Congreso de La Republica Informe Final’.
    • ‘Condenados Cuatro Ex Legisladores Peruanos Por Corrupción | Internacional | EL PAÍS’.
    • ‘Condenan a 5 Años de Cárcel al Excongresista Nacionalista Que Asesinó a Perro | America Noticias’.
    • ‘Condenan a 5 Años de Prisión Efectiva a Congresista de Fuerza Popular Política | Peru21’.
    • ‘Condenan a Siete Años de Prisión a Absalón Vásquez - Perú 21’.
    • LR, ‘Confirman pagos a ex congresista Luis Cáceres Velásquez’.
    • Correo, ‘Congresista Amado Romero fue suspendido por 120 días’.
    • ‘Congresista Aprista Elías Rodríguez Solo Fue Amonestado Por Plagio, Pese a Que Se Recomendó Suspensión Política | Peru21’.
    • ‘Congresista Elías Rodríguez Admite Plagio de Textos Para Proyectos de Ley | RPP Noticias’.
    • ‘Congresista Elías Rodríguez Plagió En Cinco Proyectos de Ley | Canal N’.
    • LR, ‘Congresista ocultó vínculos con Montesinos durante 8 meses Pennano se reunió tres veces con’.
    • ‘Congresista Yovera Fue Suspendido 120 Días Por Mentir En Su Hoja de Vida’.
    • ‘Congresista Zacarías Lapa es sentenciado a cuatro años y ocho meses de prisión’.
    • ‘Congresistas Cuestionan Que Zeballos Haya Dicho Que Adelanto de Elecciones Es Inamovible Política | Peru21’.
    • ‘Congresistas hicieron un mea culpa tras ser considerados como la institución más corrupta’.
    • ‘Congreso Suspendió Por 90 Días a Javier Diez Canseco | Política | Gestión’.
    • ‘Congreso suspendió por 120 días a María Elena Foronda por contratar a una condenada por terrorismo’.
    • ‘Congreso Suspendió Por 120 Días a María Elena Foronda Por Contratar a Una Condenada Por Terrorismo | RPP Noticias’.
    • ‘Congreso: Indignación Ante Demanda de Tula Benites Para Que Le Paguen S/.2.5 Millones Por “Daño Moral” Política | Peru21’.
    • Correo, ‘Consejo Directivo del Congreso acuerda desaforar a congresista Alejandro Yovera’.
    • ‘D4 Software Page | Ivica Ico Bukvic’.
    • Peru21, ‘Daniel Salaverry’.
    • Cañedo and Domínguez, De la confrontación a los intentos de ‘Normalización’.
    • ‘Declaran vacancia de congresista Alejandro Yovera’.
    • Peru21, ‘Del Castillo’.
    • ‘Del Solar a Del Castillo: “No Veamos Temas de Procedimiento, La Gente Está Harta de Ese Tipo de Política” Política | Peru21’.
    • ‘Detectan Plagio En Propuesta Legislativa de Yesenia Ponce Política | Correo’.
    • LR, ‘Detienen a excongresista Betty Ananculí por incumplir cuarentena’.
    • Correo, ‘Diez Canseco reconoció que canje de acciones beneficiaría a su familia’.
    • ‘Doka, Tower Hamlets JavaScript Trainer, Tower Hamlets C++ Trainer, Tower Hamlets C