I am a Data Scientist working under Dr. Brenda Curtis at the National Institute on Drug Abuse (NIDA). Formerly, I was a Senior Data Scientist for the World Well-Being Project at the Positive Psychology Center in the University of Pennsylvania.

I am also a third-year PhD student in Computer Science at the University of Pennsylvania working under H. Andrew Schwartz and Lyle Ungar. My primary research interest is community-centered NLP: developing methods to measure relationships between individuals and their communities using social media language. I'm also interested in machine learning applications to substance use and recovery and enjoy treating automatic agents as humans.

Finally, I am a father, a first-generation / low-income (FGLI) student, and a community college graduate. Please reach out if you'd like to talk about similar PhD experiences.


sgiorgi (at) sas (dot) upenn (dot) edu

Publications

Correcting Sociodemographic Selection Biases for Population Prediction from Social Media.
Salvatore Giorgi, Veronica Lynn, Keshav Gupta, Farhan Ahmed, Sandra Matz, Lyle Ungar, and H. Andrew Schwartz. International Conference on Web and Social Media (ICWSM) 2022.
PDF Supplement Bib Data Code
Twitter Corpus of the #BlackLivesMatter Movement And Counter Protests: 2013 to 2021.
Salvatore Giorgi, Sharath Chandra Guntuku, McKenzie Himelein-Wachowiak, Amy Kwarteng, Sy Hwang, Muhammad Rahman, and Brenda Curtis. International Conference on Web and Social Media (ICWSM) 2022.
PDF Bib Data Code
Nonsuicidal Self-Injury and Substance Use Disorders: A Shared Language of Addiction.
Salvatore Giorgi, McKenzie Himelein-Wachowiak, Daniel Habib, Lyle Ungar and Brenda Curtis. Workshop on Computational Linguistics and Clinical Psychology (CLPsych) 2022.
PDF Supplement Bib
A Human-Centered Hierarchical Framework for Dialogue System Construction and Evaluation.
Salvatore Giorgi, Farhan Ahmed, Lyle Ungar, and H. Andrew Schwartz. The Tenth Dialog System Technology Challenge at AAAI (DSTC10) 2022.
PDF Poster Bib
Negative Associations in Word Embeddings Predict anti-Black Bias Across Regions--but only via Name Frequency.
Austin van Loon, Salvatore Giorgi, Johannes Eichstaedt, and Robb Willer. International Conference on Web and Social Media (ICWSM) 2022.
PDF
Getting "clean" from nonsuicidal self-injury: Experiences of addiction on the subreddit r/selfharm.
McKenzie Himelein-Wachowiak, Salvatore Giorgi, Amy Kwarteng, Destiny Schriefer, Chase Smitterberg, Kenna Yadeta, Elise Bragard, Amanda Devoto, Lyle Ungar, and Brenda Curtis. Journal of Behavioral Addictions 2022.
PDF Bib Press Preregistration
Modeling Latent Dimensions of Human Beliefs.
Huy Vu, Salvatore Giorgi, Jeremy D. W. Clifton, Niranjan Balasubramanian, and H. Andrew Schwartz. International Conference on Web and Social Media (ICWSM) 2022.
PDF
Using Facebook language to predict and describe excessive alcohol use.
Rupa Jose, Matthew Matero, Garrick Sherman, Brenda Curtis, Salvatore Giorgi, H. Andrew Schwartz, and Lyle Ungar. Alcoholism: Clinical and Experimental Research (ACER) 2022.
PDF Bib
Regional personality assessment through social media language.
Salvatore Giorgi, Khoa Le Nguyen, Johannes C. Eichstaedt, Margaret L. Kern, David B. Yaden, Michal Kosinski, Martin E. P. Seligman, Lyle H. Ungar, H. Andrew Schwartz, and Gregory Park. Journal of Personality 2021.
PDF Data Bib
Characterizing Social Spambots by their Human Traits.
Salvatore Giorgi, Lyle H. Ungar, and H. Andrew Schwartz. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021.
PDF Bib Press Press
Well-Being Depends on Social Comparison: Hierarchical Models of Twitter Language Suggest That Richer Neighbors Make You Less Happy.
Salvatore Giorgi, Sharath Chandra Guntuku, Johannes C. Eichstaedt, Claire Pajot, H. Andrew Schwartz, and Lyle H. Ungar. International Conference on Web and Social Media (ICWSM) 2021.
PDF Supplement Data Bib
Discovering Black Lives Matter Events in the United States: Shared Task 3, CASE 2021.
Salvatore Giorgi, Vanni Zavarella, Hristo Tanev, Nicolas Stefanovitch, Sy Hwang, Hansi Hettiarachchi, Tharindu Ranasinghe, Vivek Kalyan, Paul Tan, Shaun Tan, Martin Andrews, Tiancheng Hu, Niklas Stoehr, Francesco Ignazio Re, Daniel Vegh, Dennis Atzenhofer, Brenda Curtis, and Ali Hürriyetoğlu. Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE) 2021.
PDF Data Bib
The emotional and mental health impact of the murder of George Floyd on the US population.
Johannes C. Eichstaedt, Garrick T. Sherman, Salvatore Giorgi, Steven O. Roberts, Megan E. Reynolds, Lyle H. Ungar, and Sharath Chandra Guntuku. Proceedings of the National Academy of Sciences (PNAS) 2021.
PDF Supplement Data Bib
Loneliness and Daily Alcohol Consumption During the COVID-19 Pandemic.
Elise Bragard, Salvatore Giorgi, Paul Juneau, and Brenda L. Curtis. Alcohol and Alcoholism 2021.
PDF Bib
Bots and misinformation spread on social media: A mixed scoping review with implications for COVID-19.
McKenzie Himelein-Wachowiak, Salvatore Giorgi, Amanda Devoto, Muhammad Rahman, Lyle Ungar, H. Andrew Schwartz, David H. Epstein, Lorenzo Leggio, and Brenda Curtis. Journal of Medical Internet Research (JMIR) 2021.
PDF Bib
Beyond Beliefs: Multidimensional Aspects of Religion and Spirituality in Language.
David B. Yaden, Salvatore Giorgi, Margaret L. Kern, Alejandro Adler, Lyle H. Ungar, Martin E. P. Seligman, and Johannes C. Eichstaedt. Psychology of Religion and Spirituality 2021.
PDF Bib
COVID-Related Victimization, Racial Bias and Employment and Housing Disruption Increase Mental Health Risk Among U.S. Asian, Black and Latinx Adults.
Celia B. Fisher, Xiangyu Tao, Tingting Liu, Salvatore Giorgi, and Brenda L. Curtis. Frontiers in Public Health 2021.
PDF Bib
Understanding Weekly COVID-19 Concerns through Dynamic Content-Specific LDA Topic Modeling.
Mohammadzaman Zamani, H. Andrew Schwartz, Johannes Eichstaedt, Sharath Chandra Guntuku, Adithya Virinchipuram Ganesan, Sean Clouston and Salvatore Giorgi. NLP+CSS 2020.
PDF Data Bib
Information-seeking vs. sharing: Which explains regional health? An analysis of Google Search and Twitter trends.
Kokil Jaidka, Johannes Eichstaedt, Salvatore Giorgi, H. Andrew Schwartz, and Lyle H. Ungar. Telematics and Informatics 2020.
PDF Bib
Closed- and Open-Vocabulary Approaches to Text Analysis: A Review, Quantitative Comparison, and Recommendations.
Johannes Eichstaedt, Margaret L. Kern, David B. Yaden, H.A. Schwartz, Salvatore Giorgi, Gregory Park, Courtney A. Hagan, Victoria Tobolsky, Laura K. Smith, Anneke Buffone, Jonathan Iwry, Martin E. P. Seligman and Lyle H. Ungar. Psychological Methods 2020.
PDF Bib
Estimating geographic subjective well-being from Twitter: A comparison of dictionary and data-driven language methods.
Kokil Jaidka, Salvatore Giorgi, H. Andrew Schwartz, Margaret L. Kern, Lyle H. Ungar, and Johannes C. Eichstaedt. Proceedings of the National Academy of Sciences 2020.
PDF Supplement Data Bib
Quantifying Community Characteristics of Maternal Mortality.
Rediet Abebe*, Salvatore Giorgi*, Anna Tedijanto, Anneke Buffone, and H. Andrew Schwartz. The Web Conference 2020, IC2S2 2020.
PDF Data Bib
Cultural Differences in Tweeting about Drinking Across the U.S.
Salvatore Giorgi, David B. Yaden, Johannes C. Eichstaedt, Robert D. Ashford, Anneke Buffone, H. Andrew Schwartz and Lyle Ungar. International Journal of Environmental Research and Public Health 2020.
PDF Bib
Exploring Substance Use Tweets of Youth in the United States: Mixed Methods Study.
Robin Stevens, Bridgette Brawner, Elissa Kranzler, Salvatore Giorgi, Elizabeth Lazarus, Maramawit Abera, Sarah Huang and Lyle Ungar. JMIR Public Health and Surveillance 2020.
PDF Bib
Digital recovery networks: Characterizing user participation, engagement, and outcomes of a novel recovery social network smartphone application.
Robert D. Ashford, Salvatore Giorgi, Beau Mann, Chris Pesce, Lon Sherritt, Lyle Ungar and Brenda Curtis. Journal of Substance Abuse Treatment 2019.
PDF Bib
Tweet Classification without the Tweet: An Empirical Examination of User versus Document Attributes.
Veronica Lynn, Salvatore Giorgi, Niranjan Balasubramanian and H. Andrew Schwartz. NLP+CSS 2019.
PDF Poster Bib
Suicide Risk Assessment with Multi-level Dual-Context Language and BERT.
Matthew Matero, Akash Idnani, Youngseo Son, Salvatore Giorgi, Huy Vu, Mohammad Zamani, Parth Limbachiya, Sharath Chandra Guntuku and H. Andrew Schwartz. CLPsych 2019.
PDF Code Poster Bib
The Remarkable Benefit of User-Level Aggregation for Lexical-based Population-Level Predictions.
Salvatore Giorgi, Daniel Preotiuc-Pietro, Anneke Buffone, Daniel Rieman, Lyle H. Ungar and H. Andrew Schwartz. EMNLP 2018.
PDF Supplement Data Poster Bib
Residualized Factor Adaptation for Community Social Media Prediction Tasks.
Mohammadzaman Zamani, H. Andrew Schwartz, Veronica Lynn, Salvatore Giorgi and Niranjan Balasubramanian. EMNLP 2018.
PDF Data Bib
Primal World Beliefs.
Jeremy Clifton, Joshua D. Baker, Crystal L. Park, David B. Yaden, Alicia Clifton, Paolo Terni, Jessica L. Miller, Guang Zeng, Salvatore Giorgi, H. Andrew Schwartz and Martin E. P. Seligman. Psychological Assessment 2018.
PDF Supplement Bib
Current and Future Psychological Health Prediction using Language and Socio-Demographics of Children for the CLPysch 2018 Shared Task.
Sharath Chandra Guntuku, Salvatore Giorgi and Lyle H. Ungar. CLPsych 2018.
PDF Bib
More Evidence that Twitter Language Predicts Heart Disease: A Response and Replication.
Johannes Eichstaedt, H. Andrew Schwartz, Salvatore Giorgi, Margaret L. Kern, Gregory Park, Maarten Sap, Darwin R. Labarthe, Emily E. Larson, Martin Seligman, and Lyle H. Ungar. PsyArXiv 2018.
PDF Data Bib
Can Twitter be used to predict county excessive alcohol consumption rates?
Brenda Curtis*, Salvatore Giorgi*, Anneke E. K. Buffone, Lyle H. Ungar, Robert D. Ashford, Jessie Hemmons, Dan Summers, Casey Hamilton, and H. Andrew Schwartz. PLOS ONE 2018.
PDF Data Bib
Modeling and Visualizing Locus of Control with Facebook Language.
Kokil Jaidka, Anneke Buffone, Salvatore Giorgi, Johannes Eichstaedt, Masoud Rouhizadeh, and Lyle Ungar. Proceedings of the International AAAI Conference on Web and Social Media 2018.
PDF Bib
DLATK: Differential Language Analysis ToolKit.
H. Andrew Schwartz, Salvatore Giorgi, Maarten Sap, Patrick Crutchley, Johannes C. Eichstaedt, and Lyle Ungar. EMNLP 2017.
PDF Code Poster Bib
On the Distribution of Lexical Features at Multiple Levels of Analysis.
Fatemeh Almodaresi, Lyle Ungar, Vivek Kulkarni, M. Zakeri, Salvatore Giorgi, and H. Andrew Schwartz. ACL 2017.
PDF Bib
Recognizing Pathogenic Empathy in Social Media.
Muhammad Abdul-Mageed, Anneke Buffone, Hao Peng, Salvatore Giorgi, Johannes Eichstaedt and Lyle Ungar. ICWSM 2017.
PDF Bib
Does well-being translate on Twitter? A comparative evaluation of English and Spanish well-being lexica.
Laura Smith, Salvatore Giorgi, Rishi Solanki, Johannes Eichstaedt, H. Andrew Schwartz, Muhammad Abdul-Mageed, Anneke Buffone and Lyle Ungar. EMNLP 2016.
PDF Data Poster Bib
Real men don't say "cute": Using automatic language analysis to isolate inaccurate aspects of stereotypes.
Jordan Carpenter, Daniel Preotiuc-Pietro, Lucie Flekova, Salvatore Giorgi, Courtney Hagan, Margaret Kern, Anneke Buffone, Lyle Ungar and Martin Seligman. SPPS 2016.
PDF Supplement Bib
Studying the Dark Triad of Personality using Twitter Behavior.
Daniel Preotiuc-Pietro, Jordan Carpenter, Salvatore Giorgi and Lyle Ungar. CIKM 2016.
PDF Bib
Analyzing Biases in Human Perception of User Age and Gender from Text.
Lucie Flekova, Jordan Carpenter, Salvatore Giorgi, Lyle Ungar, and Daniel Preotiuc-Pietro. ACL 2016.
PDF Poster Bib
Analyzing crowdsourced assessment of user traits through Twitter posts.
Lucie Flekova, Daniel Preotiuc-Pietro, Jordan Carpenter, Salvatore Giorgi, and Lyle Ungar. HCOMP 2015.
PDF Supplement Poster Bib
Design and Evaluation of a Web-based Virtual Open Laboratory Teaching Assistant (VOLTA) for Circuits Laboratory.
Firdous Saleheen, Salvatore Giorgi, Zachary Smith, Joseph Picone and Chang-Hee Won. ASEE Annual Conference and Exposition 2015.
PDF Bib
Adaptive Neural Replication and Resilient Control Despite Malicious Attacks.
Salvatore Giorgi, Firdous Saleheen, Frank Ferrese and Chang-Hee Won. 5th International Symposium on Resilient Control Systems 2012.
PDF Bib


* equal contribution

Data

Black Lives Matter Twitter Corpus
A data set of 41.8 million tweets from 10 million users, each containing at least one of the following keywords: BlackLivesMatter, AllLivesMatter and BlueLivesMatter.
County Tweet Lexical Bank
County-level word and topic loadings derived from a 10% Twitter sample from 2009-2015. Anonymized linguistic features extracted from over 1.5 billion English tweets mapped to U.S. counties.

Software

Differential Language Analysis ToolKit (DLATK)
DLATK is an end-to-end human text analysis package, specifically suited for social media and social scientific applications. It is written in Python 3 and developed by the World Well-Being Project at the University of Pennsylvania and Stony Brook University.
Text
R package for transforming text to state-of-the-art word embeddings that are ready to be used for downstream tasks.
TwitterMySQL
TwitterMySQL is a Python library developed by the World Well-Being Project to pull tweets from the Twitter API and insert them into MySQL.
reddit-crawler-mysql
Reddit crawler with MySQL backend
flask-twitter-predictions
Flask app for running age, gender and (fake) personality predictions from Twitter data.
Map of Twitter Hashtags in Pennsylvania
Community structure of Twitter in Pennsylvania based on users' hashtag use.



Teaching

ENGR 1101: Intro to Engineering
The purpose of ENGR 1101 is to provide you with an understanding of the study and practice associated with the civil, electrical and mechanical engineering and technology disciplines. The electrical section covers several key topics: programming in C/C++, hardware design using breadboards and electrical components, interfacing software with hardware using microcontrollers (Arduino), and an introduction to circuit analysis. The last part of the course is the Hovercraft design project, in which you will learn soldering techniques and how to efficiently design a fully functional prototype that uses an iPad application as the remote controller.

ENGR 4296: Senior Design Project II
Team-oriented engineering system design problems of various types. Topics proposed and orally presented by students in the initial stage of the course sequence. At completion, the project is demonstrated during an oral presentation and a final written report.

ECE 3613: Microprocessor Systems Laboratory
This course provides hands-on experience in assembly language programming for the Intel i186EX 16-bit microprocessor and its hardware system implementation. The laboratory assignments use 80X86 microprocessor simulations in Emu8086 (www.emu8086.com) and hardware experiments with the FlashLite186 microcomputer by JK Microsystems (www.jkmicro.com), with processor bus logic and output signal measurements taken using the TechTools DigiView logic analyzer.