Important
This assignment must be completed individually. The solution submitted for grading must be yours. Please refer to the Academic Integrity section of our class syllabus.
Overview
As discussed in class, the modern email delivery system is typically organized into the following roles:
- Mail User Agent: a user application for composing/editing messages
- Mail Submission Agent: a system program that receives messages from an MUA and pushes them to an MTA for further processing by the mail delivery network
- Mail Transfer Agent: a system program that coordinates with other MTAs to relay messages
- Mail Delivery Agent: a system program that places incoming messages into the appropriate user mailboxes on a filesystem
A typical real-world email setup looks like the following:
┌──────────────┐       ┌───────┐       ┌────────┐
│ Email Client │ <===> │  MSA  │ <===> │ MTA(s) │
└──────────────┘       └───────┘       └────────┘
In practice, multiple roles can be implemented in a single mail program. For instance, the last node of the MTA chain typically also runs as an MDA.
In this programming exercise, you will implement a rudimentary MSA that communicates only with a Mail User Agent. Messages received by your MSA will not be injected into the actual mail delivery network. Essentially, incoming messages will disappear into /dev/null. Hence the title of this programming assignment, NUSA (NUll Submission Agent), which also means "archipelago" in Bahasa Indonesia.
The email setup for this assignment is much simpler:
┌──────────────┐       ┌───────┐       ┌───────────┐
│ Email Client │ <===> │  MSA  │ <===> │ /dev/null │
└──────────────┘       └───────┘       └───────────┘
Interaction Flow
Assume we have the following setup:
- An email client (MUA) executed by a user who has an email account on foo.net
- An email server (running as an MSA) at nusa.foo.net for users under the foo.net domain
Suppose me@foo.net is composing the following short message for you@mail.app:
Subject: Breaking News
Have you heard about NUSA? An email server which will never
fill up your mailbox?
The sequence of interactions below shows the commands sent by each party (C: indicates messages sent by the client, S: by the server) and conforms to the sequencing overview described in Section 4.3.2 of RFC5321:
C: (Initiate connection)
S: (Accept connection request)
S: 220 nusa.foo.net
C: EHLO nusa.foo.net
S: 502 OK
C: HELO nusa.foo.net
S: 250 OK
C: MAIL FROM:<me@foo.net>
S: 250 OK
C: RCPT TO:<you@mail.app>
S: 250 OK
C: DATA
S: 354 OK
C: Message-ID: 7123-dd-fc62
Date: Thu, 16 May 2024 10:22:37 -0500
To: you@mail.app
From: Mason Engelberg <me@foo.net>
Subject: Breaking News
Content-Transfer-Encoding: 7bit
Content-Language: en-US

Have you heard about NUSA? An email server which will never
fill up your mailbox?
.
S: 250 OK
C: QUIT
S: 221 OK
RFC5321 Appendix D shows more interaction scenarios between a mail client and a server.
Notice the following:
An email client (MUA) may insert additional headers beyond what the sender types. In the example above, the client automatically added the following headers: Message-ID, Date, To, From, Content-Transfer-Encoding, Content-Language.
Exactly one blank line separates the header lines from the email body.
The email client (MUA) marks the end of the email body with a DOT on a line by itself (the last line the client sends before the server's final 250 OK). This is the "End of Message" marker. The user did not type this DOT as part of the original message. If the user's message happens to include a single DOT on a line by itself, a compliant MUA transforms it to .. (two consecutive dots) so that it is not mistaken for the End of Message marker.
In the above illustration, the lines the client sends after the 354 reply (Message-ID through the lone DOT) are presented in a format that is easier for humans to read. In reality, they are delivered as one continuous stream of roughly 300 bytes, shown below in 50-character rows of text with
- \r in place of the Carriage Return character (ASCII 0x0D)
- \n in place of the Line Feed character (ASCII 0x0A)
Message-ID: 7123-dd-fc62\r\nDate: Thu, 16 May 2024
10:22:37 -0500\r\nTo: you@mail.app\r\nFrom: Mason
Engelberg <me@foo.net>\r\nSubject: Breaking News\r
\nContent-Transfer-Encoding: 7bit\r\nContent-Langu
age: en-US\r\n\r\nHave you heard about NUSA? An em
ail server which will never fill up your mailbox?\r
\n.\r\n
TIP
In RFC5321, these two special characters are denoted as CR and LF, and SP denotes the space character (ASCII 0x20).
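The observations above (one blank line between headers and body, the lone-DOT terminator, and the doubled leading dot) can be sketched in a few lines of Python. This is a minimal illustration, not a required design: the function name parse_data_payload is hypothetical, it assumes the whole payload is already in memory, and it ignores folded (multi-line) headers. The dot handling follows the transparency rule in RFC5321 Section 4.5.2, where the receiver removes one leading dot from any line that starts with a dot.

```python
def parse_data_payload(payload: bytes):
    """Split a complete DATA payload into (headers, body_lines).

    `payload` is everything the client sent after the 354 reply,
    up to and including the final <CRLF>.<CRLF> marker.
    """
    terminator = b"\r\n.\r\n"
    if not payload.endswith(terminator):
        raise ValueError("payload is not terminated by <CRLF>.<CRLF>")
    content = payload[: -len(terminator)]

    # The first empty line separates the header block from the body.
    header_block, _, body_block = content.partition(b"\r\n\r\n")

    headers = {}
    for line in header_block.split(b"\r\n"):
        name, sep, value = line.partition(b":")
        if sep:  # skip anything that is not "Name: value"
            key = name.decode(errors="replace").strip().lower()
            headers[key] = value.decode(errors="replace").strip()

    body_lines = []
    for line in body_block.split(b"\r\n"):
        # Undo dot-stuffing: the client added one extra leading dot.
        if line.startswith(b"."):
            line = line[1:]
        body_lines.append(line.decode(errors="replace"))
    return headers, body_lines
```

With a result in hand, the subject line is available as headers.get("subject", ""), and a body line the user typed as a single DOT comes back as "." rather than ending the message early.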
Program Requirements
After receiving the "end of message" marker (the lone DOT) above, a real Mail Submission Agent will ship the message to the destination SMTP server. Your MSA implementation will not push the incoming messages to the actual mail transport network. Instead, it will perform the following tasks:
Verify that the recipient address (provided in RCPT TO) satisfies the following rules for a valid email address:
- It ends with .com, .org, .net, .edu, .io, or .app
- It contains exactly one @ character separating the username and domain name
- The domain name (the string between @ and the last .) is not empty and contains only alphanumeric characters
- The username (the string before the @) is not empty
When any of these rules is not satisfied, the MSA must respond with a 550 status that includes the nature of the error after the numeric code, such as 550 Unknown TLD
Verify that the subject line (embedded in the message header) is not blank. The RFC5321 standard specifies several possible error codes when handling the DATA command: 450, 451, 452, 550, 552, and 554. Among these options, the 451 error code seems to be the most appropriate.
Verify that the client does not attempt to send the message to more than five recipients. Otherwise, respond with a 550 status that includes the nature of the error after the numeric code, such as 550 Too many recipients.
Print the entire message body to stdout. Your code must be able to consume a message body larger than the buffer size used for reading incoming bytes from the socket.
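The recipient rules and the five-recipient limit above can be sketched as two small functions. This is one possible reading of the rules, not the reference solution: the names check_recipient and rcpt_reply, and every reason string other than "Unknown TLD" and "Too many recipients", are hypothetical. It also assumes the address has already been extracted from between the angle brackets of the RCPT TO command, and that "alphanumeric" means ASCII letters and digits.

```python
ALLOWED_TLDS = ("com", "org", "net", "edu", "io", "app")
MAX_RECIPIENTS = 5


def check_recipient(address: str):
    """Return None if `address` passes every rule, else a short reason."""
    if address.count("@") != 1:
        return "Address must contain exactly one @"
    username, domain_part = address.split("@")
    if not username:
        return "Empty username"
    # Split "mail.app" into the domain name ("mail") and the TLD ("app").
    domain, dot, tld = domain_part.rpartition(".")
    if not dot or tld.lower() not in ALLOWED_TLDS:
        return "Unknown TLD"
    # Assumption: only ASCII letters and digits count as alphanumeric.
    # An empty domain fails here too, because "".isalnum() is False.
    if not (domain.isascii() and domain.isalnum()):
        return "Invalid domain name"
    return None


def rcpt_reply(address: str, accepted_so_far: int) -> str:
    """Build the reply line for one RCPT TO command."""
    if accepted_so_far >= MAX_RECIPIENTS:
        return "550 Too many recipients"
    reason = check_recipient(address)
    return "250 OK" if reason is None else f"550 {reason}"
```

Here accepted_so_far is the number of recipients already accepted in the current mail transaction, so the sixth RCPT TO is the first one rejected for exceeding the limit.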
In addition, your MSA implementation must be able to handle
- messages of any size. Specifically, your program should not impose a limit on the size of the message body provided between the DATA command and the end of message marker. Recall that your program is not required to save the message body anywhere, but it must be able to correctly "consume" any number of bytes from its socket
- any type of attachment in the message body
- multiple email transactions from clients, i.e. you can send multiple emails without restarting the server
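One way to meet the "any size" requirement above is to echo bytes as they arrive and keep only a few trailing bytes in memory, so that an end of message marker split across two recv calls is still detected. The sketch below is an illustration under stated assumptions: consume_message and read_chunk are hypothetical names, the echoed text is not dot-unstuffed, and any bytes the client sends after the marker in the same chunk are discarded (acceptable only because a client that was refused EHLO waits for the reply before sending its next command).

```python
import sys

TERMINATOR = b"\r\n.\r\n"


def consume_message(read_chunk, out=sys.stdout):
    """Echo a message to `out` until <CRLF>.<CRLF> arrives.

    `read_chunk` is any zero-argument callable returning the next bytes,
    for example: lambda: connectionSocket.recv(1024)
    Returns the number of message bytes consumed (marker excluded).
    """
    keep = len(TERMINATOR) - 1  # bytes that could start a split marker
    window = b"\r\n"            # virtual CRLF so a bare ".\r\n" matches
    virtual = 2                 # virtual bytes not yet dropped from output
    total = 0
    while True:
        chunk = read_chunk()
        if not chunk:
            raise ConnectionError("connection closed before end of message")
        window += chunk
        end = window.find(TERMINATOR)
        done = end != -1
        # Emit everything that can no longer be part of the marker.
        cut = end if done else max(len(window) - keep, 0)
        ready, window = window[:cut], window[cut:]
        real = ready[virtual:]
        virtual = max(virtual - len(ready), 0)
        # Assumption: 7bit text; other bytes are shown as replacements.
        out.write(real.decode("utf-8", errors="replace"))
        total += len(real)
        if done:
            return total
```

Memory use stays bounded by the chunk size plus four bytes, regardless of how large the message body is.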
Extra Credit
For extra credit, parse the message body for attachments and count the number of attached files. Respond with error code 550 if too many files (> 5) are attached to the message.
Refer to RFC 2045 to understand how attachments are identified both in the message header and message body. This document is actually the first part of five:
- RFC2045: MIME Part 1: Format of Internet Message Bodies
- RFC2046: MIME Part 2: Media Types
- RFC2047: MIME Part 3: Message Header Extensions for Non-ASCII Text
- RFC2048: MIME Part 4: Registration Procedures
- RFC2049: MIME Part 5: Conformance Criteria and Examples
You may be interested in reading Part 1 and Part 5, and skipping Parts 2 through 4. In particular, Part 5 shows an example of embedding attachments in the message body.
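As a rough illustration of how MIME marks attachments, the sketch below uses Python's standard email package to walk the parts of a message and count those whose Content-Disposition is "attachment". Treat it as a way to explore the format rather than the expected solution: whether the email package may be used for the extra credit is an assumption to confirm with your instructor, the names count_attachments and attachment_reply are hypothetical, and parts marked "inline" are deliberately not counted.

```python
from email import message_from_bytes

MAX_ATTACHMENTS = 5


def count_attachments(raw_message: bytes) -> int:
    """Count MIME parts whose Content-Disposition is 'attachment'.

    `raw_message` is the headers plus body, without the end of message
    marker and after dot-unstuffing.
    """
    msg = message_from_bytes(raw_message)
    # walk() visits the message itself and every nested MIME part.
    return sum(
        1
        for part in msg.walk()
        if part.get_content_disposition() == "attachment"
    )


def attachment_reply(raw_message: bytes) -> str:
    """Build the reply sent after the end of message marker."""
    if count_attachments(raw_message) > MAX_ATTACHMENTS:
        return "550 Too many attachments"
    return "250 OK"
```

Printing each part's get_content_type() while walking a message sent from a real client is a quick way to see the multipart structure that RFC2045 describes.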
Program Design
Your MSA program shall be implemented in Python 3.10 (or newer) and communicate with a real email client. You are welcome to use any email client during development, but your instructor will test your implementation using the Thunderbird email client.
- The MSA program will listen on port 9000 for incoming connections
- It shall be designed to handle several MUA client connections simultaneously. Use Python threads: start a new thread to execute the function that handles each client's interaction. Refer to the section Using Thread below.
Starter Code
from socket import socket, AF_INET, SOCK_STREAM, SOL_SOCKET, SO_REUSEADDR
# Create a TCP socket that listens to port 9000 on the local host
welcomeSocket = socket(AF_INET, SOCK_STREAM)
welcomeSocket.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
welcomeSocket.bind(("", 9000))
welcomeSocket.listen(4) # Max backlog 4 connections
print ('Server is listening on port 9000')
# Accept incoming connection from client
connectionSocket, addr = welcomeSocket.accept()
print ("Accept a new connection", addr)
# Your program will now communicate with the client
# via the connectionSocket
# Send the "220 response"
connectionSocket.sendall("220 abcdefghijklm".encode())
# Read AT MOST 1024 bytes from the socket
# decode(): converts bytes to text
# encode(): convert text to bytes
text = connectionSocket.recv(1024).decode()
print (f"Incoming text is {text}")
connectionSocket.close()
welcomeSocket.close()
print("End of server")
Network Traffic Inspection
It is strongly recommended that you use Wireshark as part of your debugging, especially to diagnose issues related to the exchange of commands between your MSA program and the email client. Since your MSA program will be listening on port 9000 on the localhost, Wireshark should capture traffic on the "loopback" interface, i.e. the interface with IPv4 address 127.0.0.1 or IPv6 address ::1.
Furthermore, to declutter the captured network traffic and show only packets to and from port 9000, you can apply the filter expression tcp.port == 9000.
Testing & Debugging
You are welcome to use your favorite email client to test your MSA program, but your instructor will be using Thunderbird to grade your work.
Thunderbird Setup
Account Setup
- Enter your full name
- Enter Email Address: xxx@test.com (it does not have to be an existing account)
- Press "Configure manually"
- Skip the settings for "Incoming Server" and "Outgoing Server"
- Click "Advanced config" and when the "Confirm Advanced Configuration" dialog shows up, click "OK" to proceed
- The account settings page should show your xxx@test.com account
Account Setting (xxx@test.com)
Under "Server Settings"
- Disable "Check for new message at startup"
- Disable "Check for new messages every X minutes"
- Disable "Allow immediate server notifications when new messages arrive"
Under "Copies & Folders"
- Disable "Place a copy in"
- Disable "Keep message archives in"
Under Outgoing Server (SMTP) settings, edit the following
- Server Name: localhost
- Port: 9000
- Connection security: None
- Authentication method: No authentication
Verify Thunderbird Configuration
To check that Thunderbird is configured correctly:
Run Wireshark
- Capture traffic on localhost
- Apply filter on TCP port 9000
Run the starter program
python starter.py
Server is listening on port 9000
Start the Thunderbird client, create a new message, and click the "Send" button. The starter program should respond with
Accept a new connection ('127.0.0.1', zzzzz) # zzzzz is the TCP port number
Since the starter code is not a fully functional MSA, Thunderbird will get stuck waiting for an RFC5321-compliant response from your server.
Press the "Cancel" button to close the dialog.
Confirm that Wireshark captured the TCP packet that delivers the text 220 abcdefghijklm (sent by the sendall call in the starter code)
Using Thread
In the following snippet, the function someWork executes in a new thread which runs concurrently with the main thread. The output from the two print calls will be interleaved.
from threading import Thread
from time import sleep

def someWork(limit):
    for k in range(limit):
        print(f"In function {k}")
        sleep(0.4) # 400 milliseconds

def main():
    # args must be a tuple or list; (51,) is a one-element tuple
    t1 = Thread(target=someWork, args=(51,))
    t1.start()
    for m in range(20):
        print(f"In Main {m}")
        sleep(0.25) # 250 milliseconds

main()
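Applied to the MSA, the same idea becomes an accept loop that hands every new connection to its own thread. The sketch below is a minimal outline under assumptions: handle_client, serve, and make_listener are hypothetical names, and the handler only sends a greeting where the real SMTP command loop would go.

```python
from socket import socket, AF_INET, SOCK_STREAM, SOL_SOCKET, SO_REUSEADDR
from threading import Thread


def handle_client(connectionSocket, addr):
    """Hypothetical per-client handler: greet, then hang up."""
    with connectionSocket:  # closes the socket when the block ends
        connectionSocket.sendall(b"220 nusa.foo.net\r\n")
        # ... the SMTP command loop for this client would go here ...


def make_listener(port=9000):
    """Create the listening socket the same way the starter code does."""
    welcomeSocket = socket(AF_INET, SOCK_STREAM)
    welcomeSocket.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
    welcomeSocket.bind(("", port))
    welcomeSocket.listen(4)
    return welcomeSocket


def serve(welcomeSocket, handler=handle_client):
    """Accept connections until the listening socket is closed."""
    while True:
        try:
            connectionSocket, addr = welcomeSocket.accept()
        except OSError:
            break  # the listening socket was closed
        # One thread per client, so a slow client does not block others.
        Thread(target=handler, args=(connectionSocket, addr),
               daemon=True).start()
```

Calling serve(make_listener()) would run the loop on port 9000. Because the main thread returns to accept() immediately, a second client can connect while the first is still in the middle of its DATA phase.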
Testing Your MSA Functionalities
In addition to the "normal use cases", be sure to test your program using "out of ordinary" scenarios:
Invalid recipient email address format
Missing subject lines
Sending a message that includes a DOT (by itself) somewhere in the beginning and middle of the message body.
Too many recipients
Large message body whose size exceeds the size of your "read buffer" as specified in the socket recv function call: ______.recv(1024).decode()
Attach a variety of file types in the email message.
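Some of these scenarios are tedious to reproduce by hand in a mail client, so a small script built on Python's standard smtplib and email packages can complement (not replace) testing with Thunderbird. The sketch below is one possible helper: build_message and send_to_msa are hypothetical names, and it assumes your MSA is listening on localhost port 9000. As far as I know, smtplib tries EHLO first and falls back to HELO when EHLO is refused, and it sends command verbs in lowercase, which is worth checking in Wireshark against your own command parsing.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipients, subject, body, attachments=()):
    """Build a test message; pass subject=None to omit the header."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    if subject is not None:
        msg["Subject"] = subject
    msg.set_content(body)
    # Each attachment is a (filename, bytes) pair.
    for filename, data in attachments:
        msg.add_attachment(data, maintype="application",
                           subtype="octet-stream", filename=filename)
    return msg


def send_to_msa(msg, host="localhost", port=9000):
    """Send `msg`; raises an smtplib.SMTPException subclass on rejection."""
    with smtplib.SMTP(host, port, timeout=10) as client:
        client.send_message(msg)
```

For example, send_to_msa(build_message("me@foo.net", ["you@mail.app"], None, "first\n.\nlast\n")) exercises both the missing-subject check and a DOT on a line by itself, and a body such as "x" * 100000 exercises a message larger than the read buffer.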
Grading Rubrics (Tentative)
Item | Point |
---|---|
Socket setup & clean up | 4 |
Validate the email syntax of recipient | 4 |
Validate the subject line exists and is not empty | 4 |
Limit 5 message recipients | 4 |
Consume and print the entire message body | 4 |
Use multithreading | 5 |
Server handles multiple email transactions | 3 |
Limit 5 attachments (Extra Credit) | 5 |