How storytelling is a tool that Asians can’t forget to use

심상봉 (1935–2020) Photo illustration courtesy of the author

I am genuinely curious — here’s a question for you, your Asian American friends, or any friend from an immigrant family:

How often do your parents tell you stories of adversity or proud moments in your family’s history?

Or if your parents are first generation, how many stories have you heard about what it was like growing up in Asia? Or even what your grandparents did for work?

Image by Author.

It is hard to fail, but it is worse never to have tried to succeed. — Theodore Roosevelt, 1899

Get on the #StruggleBus.

The Difficulty of Securing Internships During COVID

The impacts of the Coronavirus are here to stay. As a student, you’re most likely learning through your laptop screen, dealing with long-winded take-home exams, and feeling exhausted from an internship search in a market with fewer openings than ever.

Image from C-SPAN, layered with a crowd estimate heatmap.

Experts have said that without aerial photos of the riot it is difficult to estimate the size of the crowd. The one estimate everyone can agree on is that there were few police compared to the number of rioters.

Using Streamlit and CSRNet, developed by Yuhong Li, Xiaofan Zhang, and Deming Chen, let’s put together a tool to estimate the number of protesters in any given photo. This tool can help create descriptive analyses comparing the ratio of police to crowd for this event as well as past gatherings (e.g. …
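As a rough illustration of the counting step: CSRNet-style models output a density map, and the estimated head count is the sum of its values. The sketch below assumes the density map has already been produced and is available as a plain 2D list. The function names and the toy values are hypothetical and are not taken from the article's code.

```python
from typing import Sequence


def estimate_count(density_map: Sequence[Sequence[float]]) -> float:
    """Sum every cell of a 2D density map to get an estimated head count."""
    return sum(sum(row) for row in density_map)


def police_to_crowd_ratio(police_count: float, crowd_count: float) -> float:
    """Return police per crowd member; both counts are supplied by the caller."""
    if crowd_count <= 0:
        raise ValueError("crowd_count must be positive")
    return police_count / crowd_count


# Toy density map with made-up values (NOT real model output).
toy_map = [
    [0.0, 0.5, 0.5],
    [1.0, 0.25, 0.25],
    [0.0, 0.5, 1.0],
]

count = estimate_count(toy_map)
print(f"Estimated crowd size: {count}")
print(f"Police-to-crowd ratio: {police_to_crowd_ratio(1, count)}")
```

In a real tool, the density map would come from running the model on an uploaded photo, and the police count would need its own estimate.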

Image by Author

How far will COVID travel this holiday season? It’s a legitimate question, and it has experts worried. The TSA reported that over a million travelers passed through its checkpoints on December 28th. Although that is only 50% of the number of travelers a year ago, it’s still 100% more potential carriers of the virus than a year ago.

While the TSA, along with the airlines, faces significant hurdles in ensuring the safety of travelers, local and state governments will undoubtedly face similar or greater hurdles if COVID numbers tick upwards during the holiday season.

Image from vintagelegobuilder, gfycat.

Throughout the pandemic, you’ve probably been asked by family or friends: “How can I pick up coding?” Or you yourself have asked, “I should learn to code. How do you actually get started?”

The short answer is projects and mindset. Most coders, formally educated or self-taught, will tell you that while it’s critical to know the terms and concepts (which can be picked up from online reading, classes, or YouTube), the working knowledge they use day to day came when they dove into a project.

How the “Learn-to-Code” Market Child-Proofs Your Path (Badly)

Today’s economy requires well-rounded data scientists who can move effortlessly between business, research, and data science mindsets.

Image by author. US car accidents, Kaggle.

There’s a classic project that I hear data science job applicants talk about, and you probably know it too: use machine learning to predict traffic accidents in London. During interviews, applicants walk me step by step through the project, from cleaning the data to running the machine learning algorithms that their professors had suggested.

My problem is not the topic of the project; it’s the way it’s taught. To conclude their meticulous step-by-step recitation, many students proudly announce the accuracy levels they achieved. I ask, “So… where were the accidents though? Anything interesting?” …

What to do when Twitter updates its front end

A man is using a tool to scrape away at a plank of wood. The Twitter bird logo is Photoshopped to appear stuck in the wood.
Image from Ivan Samkov on Pexels

Edit April 9, 2021:

Hey! Update here. Because of Twitter’s changes in 2020, I don’t see a way forward at this point with libraries like Taspinar’s Twitterscraper repo, which scraped up a whole lot of tweets quickly.

This article walks readers through Selenium and how to implement it with Twitter. That being said, if you’re trying to get some data asap and don’t need to learn about Selenium atm, @Altimis has a repo that wraps up Selenium nicely for Twitter. I’ll post a script at the end of this article to use his repo.

Whenever Twitter updates its front end…

And How Open Source Cultures Can Change That

By Theo Goetemann & Hyunjae Cho

“To put it another way, it is easier for a beginner data scientist to find and use open source libraries to analyze text-based social media buzz around Parasite, a Korean movie, in English than it is to analyze it in Korean.”

Parasite wasn’t just a commercial success; it was a comedic and linguistic challenge. Movie critics say that Parasite could not have been a critical hit in America without subtitles whose quality reflects the semantics and cultures on this side of the Pacific.

Image from Wikipedia

Darcy Paquet, an American movie…

Image by Author

The news is saturated with the spread of COVID through the White House administration, so let’s dive immediately into the code and visualization of the graphic below. Dataset here. Code here.


Using the New York Times article Tracking the White House Coronavirus Outbreak, we can create a quick Excel sheet listing all the individuals who were with President Trump prior to the announcement of his positive COVID test.
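One way such a sheet could feed a visualization is to turn it into a list of contacts between people who attended the same event. The sketch below is only an assumption about the approach: the two-column layout (person, event), the placeholder names, and the function name are hypothetical and do not come from the article's dataset or code.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations

# Placeholder rows standing in for a sheet exported to CSV (NOT real data).
SAMPLE_CSV = """person,event
Person A,Event 1
Person B,Event 1
Person C,Event 1
Person A,Event 2
Person D,Event 2
"""


def build_contact_edges(csv_text):
    """Map each pair of people to the set of events they both attended."""
    attendees = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        attendees[row["event"].strip()].add(row["person"].strip())

    edges = defaultdict(set)
    for event, people in attendees.items():
        # Sorting keeps each pair in a stable (a, b) order.
        for a, b in combinations(sorted(people), 2):
            edges[(a, b)].add(event)
    return dict(edges)


edges = build_contact_edges(SAMPLE_CSV)
for (a, b), events in sorted(edges.items()):
    print(f"{a} <-> {b}: {', '.join(sorted(events))}")
```

The resulting pairs can be handed to whatever graphing tool is used for the final graphic.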

Image by Author

The first debate was a mess. But like most news today, it will undoubtedly fade as the next story comes out (e.g. Trump testing positive for COVID). Therefore, let’s use some data science and tools to analyze and visualize the debate as quickly as possible before it fades to the background!

Unlike many other data science-oriented articles out there, I’ll be focusing more on quick and dirty ways of data processing, analysis and visualization because — in full transparency — while the visuals above and the ones below may be fun to look at and serve an eye-catching purpose, they…

Theo Goetemann. Founder @Basil Labs. #AI #OpenData #ConsumerIntelligence

