Fetching and Analyzing Mail from Gmail using Python

Shantanu Patra
4 min read · Oct 1, 2024

Email communication is a key part of our daily digital life, and accessing email programmatically can be useful for automating tasks such as monitoring messages, extracting information, and performing data analysis. Gmail, one of the most popular email services, provides ways to access its inbox programmatically via APIs or protocols such as IMAP.

In this article, we will explore how to fetch emails from Gmail using Python and analyze their content. We’ll use the IMAP protocol to retrieve emails and the Python imaplib and email libraries to parse them. Finally, we’ll look at some analysis techniques to extract meaningful information from emails.

Prerequisites

Before starting, you will need:

  • A Gmail account.
  • Basic knowledge of Python.
  • Python's built-in imaplib and email modules (no installation needed), plus nltk for the analysis step.

You can install nltk using pip:

pip install nltk

Step 1: Setting Up Gmail for IMAP Access

To interact with Gmail via IMAP, we must enable IMAP access in our Gmail settings. Follow these steps:

  1. Open Gmail.
  2. Go to Settings > See all settings.
  3. Navigate to the Forwarding and POP/IMAP tab.
  4. In the IMAP Access section, select Enable IMAP.
  5. Save changes.

Additionally, if you sign in with a username and password (and not OAuth2), Gmail will usually reject your regular account password. Google has phased out the Less Secure App Access option for personal accounts, so the practical route is to turn on 2-Step Verification and generate an App Password in your Google account.
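Whichever login method you choose, it helps to keep the password out of the script itself. Here is a minimal sketch that reads the credentials from environment variables; the names GMAIL_USER and GMAIL_APP_PASSWORD and the helper load_credentials are hypothetical choices for this example, not something Gmail requires.

```python
import os


def load_credentials(env=None):
    """Return (username, app_password) read from environment variables.

    GMAIL_USER and GMAIL_APP_PASSWORD are hypothetical names chosen for
    this sketch; use whatever names fit your setup.
    """
    env = os.environ if env is None else env
    username = env.get("GMAIL_USER", "").strip()
    password = env.get("GMAIL_APP_PASSWORD", "").strip()
    if not username or not password:
        # Fail early with a clear message instead of a confusing login error.
        raise RuntimeError("Set GMAIL_USER and GMAIL_APP_PASSWORD first")
    return username, password
```

The returned pair can be passed straight to mail.login(username, password), so the secret never has to be committed alongside the code.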

Step 2: Fetching Emails with Python

Now, let’s dive into the code. The Python imaplib library allows us to connect to an IMAP server and retrieve messages. Here's an example that logs into Gmail, fetches the most recent email in the inbox, and prints its subject, sender, and plain-text body.

import imaplib
import email
from email.header import decode_header

# Connect to the Gmail IMAP server
imap_server = "imap.gmail.com"
username = "your-email@gmail.com"
password = "your-app-password"  # App Password (requires 2-Step Verification)

# Create an IMAP4 class with SSL
mail = imaplib.IMAP4_SSL(imap_server)

# Log in to the server
mail.login(username, password)

# Select the mailbox you want to search, in this case, the inbox
mail.select("inbox")

# Search for specific emails (in this case, all emails)
status, messages = mail.search(None, "ALL")

# Split the space-separated result into a list of email IDs
email_ids = messages[0].split()

# Fetch the latest email (the last ID in the list)
for email_id in email_ids[-1:]:
    status, msg_data = mail.fetch(email_id, "(RFC822)")

    for response_part in msg_data:
        if isinstance(response_part, tuple):
            # Parse the message into an email object
            msg = email.message_from_bytes(response_part[1])

            # Decode the email subject (the header may be missing)
            subject, encoding = decode_header(msg.get("Subject", ""))[0]
            if isinstance(subject, bytes):
                # If it's a bytes type, decode to str
                subject = subject.decode(encoding if encoding else "utf-8")
            print("Subject:", subject)

            # Read the sender's address
            from_ = msg.get("From")
            print("From:", from_)

            # If the email message is multipart
            if msg.is_multipart():
                for part in msg.walk():
                    content_type = part.get_content_type()
                    content_disposition = str(part.get("Content-Disposition"))

                    if "attachment" not in content_disposition:
                        # Get the email body
                        if content_type == "text/plain":
                            body = part.get_payload(decode=True)
                            charset = part.get_content_charset() or "utf-8"
                            print("Body:", body.decode(charset, errors="replace"))
            else:
                # The email body is not multipart
                body = msg.get_payload(decode=True)
                charset = msg.get_content_charset() or "utf-8"
                print("Body:", body.decode(charset, errors="replace"))

# Logout from the server
mail.logout()

Step 3: Understanding the Code

Connecting to Gmail

We start by connecting to Gmail’s IMAP server using imaplib.IMAP4_SSL, which opens an encrypted connection. The login() method takes your Gmail credentials (email address and password). For Gmail, the password should be an App Password, which you can generate once two-factor authentication (2FA) is enabled on the account.

Fetching Emails

We use the select() method to choose the mailbox (e.g., Inbox) and the search() method to find the IDs of the messages that match a criterion. In this example, we search for all emails ("ALL").
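Searching for "ALL" can return a very long list, so narrower criteria are often more practical. The sketch below builds an IMAP search string from a few common filters; build_search_criteria is a hypothetical helper written for this article, and it assumes the sender value contains no double quotes.

```python
from datetime import date

# IMAP dates use English month abbreviations regardless of locale,
# so they are spelled out here instead of relying on strftime("%b").
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def build_search_criteria(unseen=False, since=None, sender=None):
    """Build an IMAP SEARCH string such as '(UNSEEN SINCE 01-Oct-2024)'."""
    parts = []
    if unseen:
        parts.append("UNSEEN")
    if since is not None:
        month = _MONTHS[since.month - 1]
        parts.append(f"SINCE {since.day:02d}-{month}-{since.year}")
    if sender:
        parts.append(f'FROM "{sender}"')
    # With no filters, fall back to the same "ALL" search used above.
    return "(" + " ".join(parts) + ")" if parts else "ALL"


print(build_search_criteria(unseen=True, since=date(2024, 10, 1)))
```

The result can be passed as the second argument of mail.search(None, ...) in place of "ALL".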

We then take the last ID in the list, which corresponds to the most recent message, and use the fetch() method to retrieve the raw email data. This data is parsed using the email.message_from_bytes() function to convert it into an email object that’s easier to work with.
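To retrieve more than one message, the same search and fetch calls can be wrapped in a function. This is a rough sketch, assuming mail is a logged-in imaplib connection with a mailbox already selected; fetch_latest_messages is a hypothetical name, and it treats the last IDs in the search result as the newest, as the code above does.

```python
import email


def fetch_latest_messages(mail, count=5):
    """Return up to `count` parsed messages, newest first."""
    if count <= 0:
        return []
    status, data = mail.search(None, "ALL")
    if status != "OK":
        raise RuntimeError("IMAP search failed")
    email_ids = data[0].split()
    parsed = []
    # Take the tail of the ID list and walk it backwards.
    for email_id in reversed(email_ids[-count:]):
        status, msg_data = mail.fetch(email_id, "(RFC822)")
        if status != "OK":
            continue  # Skip messages the server could not return
        for part in msg_data:
            # The raw message is the second item of a tuple; other
            # entries in the response are protocol framing.
            if isinstance(part, tuple):
                parsed.append(email.message_from_bytes(part[1]))
    return parsed
```

Because the function only relies on search() and fetch(), it can also be exercised with a small stand-in object when you want to try the parsing logic without a network connection.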

Extracting Subject and Body

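The email’s subject is decoded using the decode_header() function from the email library, while the sender is read directly from the From header. The message body is extracted depending on whether the email is multipart or not: for multipart messages we walk through the parts and pick the text/plain ones that are not attachments.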
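A header can contain several encoded fragments, and a body can use a charset other than UTF-8. The following sketch handles both cases; decode_mime_header, get_plain_body, and _to_text are hypothetical helper names, and the sample message is made up for illustration.

```python
import email
from email.header import decode_header
from email.utils import parseaddr


def _to_text(data, charset):
    """Decode bytes, falling back to UTF-8 when the charset is unknown."""
    try:
        return data.decode(charset or "utf-8", errors="replace")
    except LookupError:
        return data.decode("utf-8", errors="replace")


def decode_mime_header(value):
    """Decode every fragment of a header, not only the first one."""
    if value is None:
        return ""
    pieces = []
    for fragment, charset in decode_header(value):
        if isinstance(fragment, bytes):
            pieces.append(_to_text(fragment, charset))
        else:
            pieces.append(fragment)
    return "".join(pieces)


def get_plain_body(msg):
    """Return the first text/plain part that is not an attachment."""
    parts = msg.walk() if msg.is_multipart() else [msg]
    for part in parts:
        disposition = str(part.get("Content-Disposition"))
        if part.get_content_type() != "text/plain" or "attachment" in disposition:
            continue
        payload = part.get_payload(decode=True)
        if payload is not None:
            return _to_text(payload, part.get_content_charset())
    return ""


# A made-up message used only to exercise the helpers.
raw = (
    b"From: Alice <alice@example.com>\r\n"
    b"Subject: =?utf-8?q?Caf=C3=A9?=\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Hello\r\n"
)
msg = email.message_from_bytes(raw)
name, address = parseaddr(msg.get("From", ""))
print("Subject:", decode_mime_header(msg["Subject"]))
print("Sender address:", address)
print("Body:", get_plain_body(msg).strip())
```

parseaddr() from email.utils splits the From header into a display name and a bare address, which is handy if you later want to group messages by sender.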

Step 4: Analyzing Emails

Now that we’ve fetched and parsed emails, let’s analyze the content. We can use the nltk library to perform natural language processing (NLP) on the email bodies, such as keyword extraction or sentiment analysis.

Example: Extracting Keywords

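To extract keywords, we’ll tokenize the email body and filter out stop words. First, install nltk and download the stop-word list and the tokenizer data that word_tokenize() relies on (recent nltk releases look for punkt_tab, older ones for punkt):

pip install nltk
python -m nltk.downloader stopwords punkt punkt_tab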

Here’s a simple example that extracts keywords from the email body:

import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Sample email body for analysis
email_body = "Welcome to your Gmail account. This is your first email message."

# Tokenize the email body
words = word_tokenize(email_body)

# Filter out stopwords and punctuation tokens such as "."
stop_words = set(stopwords.words("english"))
keywords = [word for word in words if word.isalpha() and word.lower() not in stop_words]

print("Keywords:", keywords)
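Once keywords can be extracted from one message, a natural next step is counting them across many. The sketch below does this with the standard library only: it swaps nltk's tokenizer for a simple regular expression and uses a tiny, illustrative stop-word list, so treat top_keywords and STOP_WORDS as hypothetical stand-ins rather than a replacement for the nltk approach.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; nltk's stopwords.words("english")
# is far more complete and can be dropped in instead.
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "this", "to", "your"}


def top_keywords(bodies, limit=5):
    """Count the most frequent non-stop-words across several email bodies."""
    counts = Counter()
    for body in bodies:
        # Lowercase first so "Invoice" and "invoice" are counted together.
        tokens = re.findall(r"[a-z]+", body.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS and len(t) > 2)
    return counts.most_common(limit)


sample_bodies = [
    "Your invoice is attached.",
    "Invoice reminder: payment due.",
    "Payment received for your invoice.",
]
print(top_keywords(sample_bodies, limit=2))
```

Feeding the function the bodies of your latest messages gives a quick view of which topics dominate the inbox.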

Example: Sentiment Analysis

You can also run sentiment analysis with nltk or with textblob, a higher-level library built on top of it (install it with pip install textblob):

from textblob import TextBlob

# Sample email body
email_body = "I am very happy with your service."

# Perform sentiment analysis; the result holds polarity (-1.0 to 1.0)
# and subjectivity (0.0 to 1.0)
blob = TextBlob(email_body)
sentiment = blob.sentiment

print("Sentiment:", sentiment)
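The polarity value is a float between -1.0 and 1.0, which is easier to report once it is mapped to a label. Here is one possible mapping; label_polarity is a hypothetical helper, and the 0.1 threshold is an arbitrary assumption you would want to tune on your own mail.

```python
def label_polarity(polarity, threshold=0.1):
    """Map a polarity score in [-1.0, 1.0] to a coarse label."""
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must be between -1.0 and 1.0")
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    # Scores close to zero are treated as neutral.
    return "neutral"


for score in (0.8, -0.5, 0.05):
    print(score, "->", label_polarity(score))
```

With textblob installed, label_polarity(TextBlob(email_body).sentiment.polarity) would label a real message body.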

Step 5: Conclusion

Using Python to fetch and analyze emails from Gmail can open up a range of possibilities for automation and insight extraction. We leveraged the imaplib and email libraries to interact with Gmail and retrieve email content. From there, we applied basic NLP techniques using nltk and textblob to analyze email text.

You can extend this approach to perform more sophisticated tasks like extracting specific data from emails, generating reports, or integrating with other systems. Make sure to adhere to security best practices when dealing with email credentials and sensitive data.
