Final Report: Baxter the Mailman
1) Introduction
a. Describe the end goal of your project.
Our goal is to automate mail sorting with the Baxter robot. In other words, we
expect the Baxter to take over the mail sorting job of an experienced staff
member at a local mail station. To do so, the Baxter should catch a piece of
mail or a small package when someone hands it over, sort it according to the
zip code handwritten on its surface, and finally pile up the sorted mail by
size.
b. Why is this an interesting project? What interesting problems did you need
to solve to make your solution work?
As e-commerce has gained popularity in recent years, we believe that online
shopping will occupy a large share of the market in the future. This makes the
delivery of goods a prominent problem. The current efficiency of the delivery
industry will not keep up with the growing demand from online shopping, and
slow delivery will hurt the customer experience. We therefore think that using
robots to automate mail sorting would significantly increase efficiency.
The problems involved in this project include Computer Vision for mail
recognition and zip code contour recognition, Machine Learning for mail
sorting, camera sensing for AR tag tracking, path planning, and algorithm
development for piling up.
c. In what real-world robotics applications could the work from your project
be useful?
The most direct application is the mail industry. The Baxter could work as a
staff member in a local mail store and take letters or packages from customers
as they hand them over, which would improve efficiency on the customer-facing
side of the industry. Similar automated sorting processes could also be applied
in libraries or warehouses.
2) Design
a. What design criteria must your project meet? What is desired functionality?
The project is designed to include sensing, path planning and actuation. In
more detail, our initial design was to use the Baxter robot to build a
human-robot interactive program in which the robot’s job includes:
· Sensing the position of the human using Kinect
· Moving the Baxter’s arms to get the mail from the human
· Recognizing the handwritten zip codes on the mail
· Putting the mail in the corresponding regions
· Piling up the mail according to size
b. Describe the design you chose.
The final design of the mail sorting automation project includes three steps.
First, the robot locates the mail being handed over. We attach AR tags that
encode the size of each piece of mail, and we use the camera in the Baxter's
arm to read the AR tags, plan a path and catch the mail. Second, the robot puts
the mail in the correct region according to the zip code handwritten on it.
Machine Learning algorithms are used to first recognize the contours of the
digits and then classify the handwritten digits. The classification model is a
Neural Network trained on the MNIST dataset. Finally, after all the mail has
been sorted, the Baxter restacks each heap so that every pile is ordered: the
biggest mail at the bottom and the smallest on top. We developed a customized
algorithm to solve this sorting problem.
c. (And for part d.) What design choices did you make when you formulated your
design? What tradeoffs did you have to make? How do these design choices impact
how well the project meets design criteria which would be encountered in a real
engineering application, such as robustness, durability and efficiency?
First, we decided to use AR tags to track the mail locations. This would be
unrealistic in a real mail station. A better approach would be to use Computer
Vision to determine mail locations and Kinect to track the human's location.
However, we found a Kinect implementation too complicated and costly for a real
setting, and Kinect is also not as robust as AR tags at locating objects. To
improve the efficiency and the efficacy of the entire process, we ended up
using AR tags and regular cameras.
Second, in the pile-up process, we decided to use preset locations for the mail
piles and to avoid reading AR tags repeatedly. This significantly reduces the
time the robot needs to locate the mail and compute the trajectories for the
pile-up phase. It also avoids the problem that the AR tags may be blocked from
the camera's view, which could stall the entire process.
Third, we used the IR sensor on the robot's arm to determine whether the mail
is close enough to the gripper. This design gives us closed-loop feedback on
the distance between the gripper and the mail, and it improves the robustness
and durability of mail capturing.
Fourth, we first wanted to use clippers to capture the mail. We soon realized
that this would involve moving both arms of the robot, which would not only
increase the time for path planning but also raise the chance of missing the
mail. We therefore adopted a suction cup instead. Our results show that the
suction cup was more effective than the clippers.
Fifth, a collision object (the table) was added to the planning environment to
avoid collisions between the robot and the table. The table size was set to be
smaller than the actual table to make pile-up work. If the collision object had
the actual table size, pile-up would fail every time because the planner would
report a collision whenever the gripper tried to pick up mail from the table.
Adding this approximate collision object reduced the number of times the robot
collided with its surroundings.
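The undersized collision object can be illustrated with a minimal sketch. The
report does not say which table dimension was reduced or by how much, so the
class `CollisionBox`, the helper `shrink_for_pickup`, the choice to lower only
the top surface, and all numbers below are assumptions made for illustration.
The resulting dimensions would then be handed to whatever planning scene the
planner uses.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CollisionBox:
    """Box-shaped collision object; all dimensions in meters (hypothetical)."""
    length: float
    width: float
    height: float


def shrink_for_pickup(box: CollisionBox, clearance: float) -> CollisionBox:
    """Return a copy of `box` whose top surface is lowered by `clearance`.

    Lowering the modeled table top leaves a gap in which the planner accepts
    gripper poses that touch mail lying on the real table.
    """
    if clearance < 0:
        raise ValueError("clearance must be non-negative")
    return replace(box, height=max(box.height - clearance, 0.0))


# Assumed example values, not measurements from the project.
actual_table = CollisionBox(length=1.2, width=0.6, height=0.75)
planning_table = shrink_for_pickup(actual_table, clearance=0.05)
```

The trade-off is the one described above: a larger clearance makes pick-up
plans easier to find but gives the planner less protection against real
contact with the table.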
3) Implementation
a. (And b.) Describe any hardware you used or built. Illustrate with pictures
and diagrams. What parts did you use to build your solution?
In the project, we mainly use the Baxter robot and its accessory kit.
The Baxter robot (full view):
The camera in the Baxter's left arm, with a maximum resolution of 1280x800, is
used for locating the mail and providing pictures for digit recognition.
A suction cup is installed at the end of the Baxter's left arm to capture the
mail.
The screen on the Baxter's head is used to display the output image of the
digit recognition.
The IR sensor on the Baxter's arm is used to measure the distance between the
gripper and the mail.
b. What parts did you use to build your solution? (see above)
c. Describe any software you wrote in detail. Illustrate with diagrams, flow
charts, or other appropriate visuals. This includes launch files, URDFs, etc.
Source files: these files ensure that the dependencies and the paths of the
project are correct.
Mailman.py: This file is the main program of the project. It contains
interactive sessions in which the program takes the human's commands to perform
different tasks.
The first session is the handover part. In this session, the program tracks the
TF messages generated from the AR tags and immediately begins path planning. We
use the MoveIt! kit to generate a feasible path using inverse kinematics. Once
the path has been calculated, the gripper moves to the destination, and the IR
sensor checks whether the gripper is close enough to the mail for the suction
cup to catch it. If not, the gripper automatically adjusts its pose by moving
forward a little at a time until it reaches the mail.
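The closed-loop approach in the handover session can be sketched as a small
loop. This is an illustration under assumptions, not the code in Mailman.py:
`approach_until_close`, the `FakeArm` stand-in, the threshold, the step size
and the step limit are all hypothetical, and distances are given in whole
millimeters to keep the arithmetic exact.

```python
def approach_until_close(read_distance, step_forward, threshold, step, max_steps):
    """Step toward the mail until the range reading is at or below `threshold`.

    `read_distance` returns the current gripper-to-mail distance and
    `step_forward(step)` moves the gripper forward by `step`. Returns the
    number of steps taken, or raises if the mail is not reached in time.
    """
    steps_taken = 0
    while read_distance() > threshold:
        if steps_taken >= max_steps:
            raise RuntimeError("mail not reached within max_steps")
        step_forward(step)
        steps_taken += 1
    return steps_taken


class FakeArm:
    """Stand-in for the robot: tracks the gap between gripper and mail (mm)."""

    def __init__(self, gap_mm):
        self.gap_mm = gap_mm

    def read_distance(self):
        return self.gap_mm

    def step_forward(self, step_mm):
        self.gap_mm -= step_mm


arm = FakeArm(gap_mm=120)
steps = approach_until_close(
    arm.read_distance, arm.step_forward, threshold=50, step=20, max_steps=20
)
# The gap goes 120 -> 100 -> 80 -> 60 -> 40, so four steps are taken.
```

The step limit is there so that a bad sensor reading cannot drive the arm
forward indefinitely.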
The second session is digit recognition. When catching the package, the camera
on the Baxter's arm captures and saves a picture of the mail for the program to
perform digit recognition. If the output digits match a preset region, the
MoveIt! kit calculates a path for the arm to move the mail to the corresponding
region.
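Matching recognized digits against preset regions could look roughly like the
sketch below. It uses only the last three digits of the zip code, as described
in the results section of this report. The region table, its keys and the
function name `region_for_zip` are hypothetical.

```python
# Hypothetical table: last three zip code digits -> preset drop region.
REGIONS = {"704": "region_A", "720": "region_B"}


def region_for_zip(recognized_digits, regions, key_length=3):
    """Return the region for the trailing `key_length` digits, or None.

    None means the digits do not match any preset region, so no drop-off
    path should be planned for this piece of mail.
    """
    key = recognized_digits[-key_length:]
    if len(key) < key_length:
        return None  # too few digits were recognized
    return regions.get(key)


print(region_for_zip("94704", REGIONS))  # region_A
print(region_for_zip("94999", REGIONS))  # None
```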
The third session is pile-up. After all the mail has been sorted into the
different regions, the program picks a region to begin piling up. During the
mail catching process, the program records the order of the mail in each pile.
We encoded the mail size information in the AR tags, so the sorting is based on
the AR tags. Our algorithm allows three temporary zones on the table and
performs a Tower-of-Hanoi-like sort. After all the piles have been sorted, the
program ends.
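The idea of restacking a pile when only the top item can be picked up can be
sketched as follows. This is not the customized algorithm from the project,
whose details the report does not give: it is a simpler selection-style stack
sort that uses a single temporary zone rather than three, and the names
`plan_pile_up`, "source", "temp" and "sorted" are assumptions.

```python
def plan_pile_up(pile):
    """Plan moves that restack `pile` with the largest mail at the bottom.

    `pile` lists mail sizes from bottom to top. Only the top item of a stack
    can be moved. Returns (moves, sorted_pile), where each move is a tuple
    (size, from_zone, to_zone).
    """
    stacks = {"source": list(pile), "temp": [], "sorted": []}
    moves = []

    def move(src, dst):
        item = stacks[src].pop()
        stacks[dst].append(item)
        moves.append((item, src, dst))

    while stacks["source"]:
        largest = max(stacks["source"])
        # Set aside everything lying on top of the largest remaining item.
        while stacks["source"][-1] != largest:
            move("source", "temp")
        move("source", "sorted")
        # Put the set-aside items back before the next round.
        while stacks["temp"]:
            move("temp", "source")
    return moves, stacks["sorted"]


moves, result = plan_pile_up([2, 3, 1])
# result is [3, 2, 1] (bottom to top), reached in 7 single-item moves.
```

More temporary zones, as in the project, would mainly cut the number of times
items are moved back and forth.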
d. How does your complete system work? Describe each step.
4) Results
a. How well did your project work? What tasks did it perform?
(The rates were calculated from the video we filmed, in which we performed the
entire process multiple times.)
Our project involves six tasks:
1. Path planning:
We use the MoveIt! kit and the Baxter simulator as the main tools to calculate
the path. To help the path planning program run stably, we carefully calibrated
the preset poses of the Baxter in our program, which greatly reduced the rate
of failing to generate a valid path. Our project was carried out on a lab
computer with 2 GiB of memory.
Success rate | 87%
2. Contour recognition:
Contour recognition is a complicated task because any other object in the frame
may be recognized as a valid contour. Therefore, to improve the contour
recognition accuracy, we devised a specific pose for the human to hand over the
mail, and we kept the handwritten digit sizes consistent across all the sample
mail for this project. In this simplified setting, contour recognition performs
reasonably well.
Success rate | 82%
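Because the digit sizes were kept consistent, one plausible way to reject
stray contours is to filter candidate bounding boxes by height. The sketch
below works on plain `(x, y, w, h)` tuples, such as a contour library's
bounding-rectangle step might produce. The function name, the expected height
and the tolerance are assumptions, not values from the project.

```python
def select_digit_boxes(boxes, expected_height, tolerance=0.25):
    """Keep boxes whose height is near `expected_height`, ordered left to right.

    `boxes` is a list of (x, y, w, h) tuples in pixels. A box is kept when its
    height lies within +/- `tolerance` (as a fraction) of `expected_height`.
    """
    low = expected_height * (1 - tolerance)
    high = expected_height * (1 + tolerance)
    kept = [box for box in boxes if low <= box[3] <= high]
    # Digits are read left to right, so order the boxes by x coordinate.
    return sorted(kept, key=lambda box: box[0])


candidates = [
    (120, 40, 18, 30),   # digit
    (10, 5, 200, 150),   # envelope outline, far too tall
    (60, 42, 20, 28),    # digit
    (90, 41, 19, 31),    # digit
    (300, 10, 4, 5),     # speck of noise
]
digit_boxes = select_digit_boxes(candidates, expected_height=30)
# Keeps the three digit-sized boxes, ordered by x: 60, 90, 120.
```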
3. Handwritten digit recognition:
For this part, we used Machine Learning to recognize digits, and we encountered
a problem. The handwritten digit recognition works very well on all digits
except "9".
The reason is that in our handwriting we usually omit the hook at the bottom of
the digit "9", causing the program to fail and output "1". This problem could
be fixed by writing "9" in a print style, but that would defeat the original
purpose of handwritten digit recognition. To work around this issue, we
considered the real-life situation: the Baxter has limited arm length and hence
can only sort mail for a limited range of zip codes. Therefore, we decided to
track only the last three digits of the zip codes, which reduces the failure
rate of the Baxter.
General success rate | 74%
Failure rate on digit "9" | 89%
Average time to perform digit recognition | 3.2s
4. Suction cup + IR closed-loop sensing:
The suction cup and the IR closed-loop sensing are built into the Baxter, and
we found that they worked stably and almost never failed.
Success rate of suction cup | 99%
Success rate of IR sensing | 99%
5. Hand-over:
The hand-over part combines all of the above processes. Apart from the problems
mentioned above, we encountered another problem: the Baxter arm would overshoot
and hit the table (or the wall, or wires on the robot). We soon realized it was
a severe issue and decided to recalibrate the configuration of the Baxter.
After reconfiguration, the failure rate was significantly reduced.
Success rate (every single handover is counted) | 45 out of 78
Path not found failures | 5 out of 78
Collision failures | 28 out of 78
6. Pile-up:
In the pile-up process, the algorithm we developed works well, in the sense
that the robot can figure out the correct pick-up and drop-off locations for
each piece of mail. However, as in the hand-over process, we still met the
overshoot problem. To resolve this, we carefully calibrated the size of the
table collision object, so that a valid path can be found and the path does not
collide with the real table.
Success rate (counting each mail move) | 102 out of 136
Overshoot failures | 34 out of 136
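The hand-over and pile-up counts can be converted to percentages so that they
are comparable with the other tasks. The short sketch below uses only the
counts reported in the two tables. The helper name `rates` and the dictionary
keys are illustrative.

```python
def rates(counts):
    """Convert outcome counts to percentages of the total, to one decimal."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


handover = {"success": 45, "path not found": 5, "collision": 28}  # 78 handovers
pile_up = {"success": 102, "failure": 34}                         # 136 mail moves

print(rates(handover))  # {'success': 57.7, 'path not found': 6.4, 'collision': 35.9}
print(rates(pile_up))   # {'success': 75.0, 'failure': 25.0}
```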
b. Illustrate with a video and pictures.
5) Conclusion
a. Discuss your results. How well did your finished solution meet your design
criteria?
The data listed above do not show it, but the success rate increased as we
continuously fixed bugs and recalibrated the Baxter. On the last 3 tests, our
process ran more smoothly and avoided most kinds of failures.
The table below discusses each design criterion.
Sensing | For mail locating by camera, our implementation uses AR tags and TF
messages, which provide the information needed for path planning and digit
recognition. The AR tags also make it possible for the robot to interact with
the human. IR sensing was adopted to measure the distance between the object
and the gripper and to ensure that the suction cup can work properly.
Moving | We use the pre-installed MoveIt! kit to perform path planning. Despite
some unexpected failures (e.g. hitting the table, the arm getting entangled
with wires), after we reconfigured and recalibrated the robot its arm moves
smoothly, and the gripper is able to catch the mail from the human as well as
pick up mail from the table.
Recognizing | We adopted Machine Learning techniques and used the picture
captured by the camera. Although the algorithm failed to recognize the digit
"9", the other digits were classified correctly and mostly matched the data we
preset.
Putting | The regions were set manually to ensure that a path could be found.
Since the recognition part gives us a zip code that matches a region, the
MoveIt! kit can calculate a collision-free path and the arm can accurately put
the mail in the corresponding region.
Path Planning | The data above show that we encountered some difficulties using
the MoveIt! kit to calculate the path. Almost every step in the process
requires a path for the Baxter's arm to move, so path planning is the most
important part of the project. We tried several measures to increase the
success rate of path planning, e.g. tuning the parameters of the MoveIt! kit
and presetting arm poses. The last few tests showed that path planning worked
stably without any failures. The finished solution provides a robust user
experience with the Baxter.
b. Did you encounter any particular difficulties?
As mentioned above, we encountered problems during the path-planning phase. At
the start, the path planning tool (the MoveIt! kit) kept failing and could not
return a valid path. We considered everything in the surroundings that could
influence the path plan, as well as the initial configuration of the program
and of the Baxter’s arm. We later discovered that if we gave the program more
time and more attempts, it was more likely to find a valid trajectory for the
arm, so we allowed the program a longer runtime than the original setting. We
also discovered that in the piling-up phase the arm refused to move close to
the table because the program rejected paths that collide with obstacles, and
hence we set the size of the table to be smaller than the actual one.
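Giving the planner more attempts can be pictured as a retry wrapper around a
single planning call. The sketch below is generic and does not call MoveIt!:
`plan_with_retries` and the stub planner are hypothetical. If the project used
the `moveit_commander` Python interface, its `set_planning_time` and
`set_num_planning_attempts` settings would be the likely knobs, but that is an
assumption.

```python
def plan_with_retries(plan_once, max_attempts):
    """Call `plan_once` until it returns a path or the attempts run out.

    `plan_once` returns a path (any non-None value) on success and None on
    failure. Returns (path, attempts_used); path is None if every attempt
    failed.
    """
    for attempt in range(1, max_attempts + 1):
        path = plan_once()
        if path is not None:
            return path, attempt
    return None, max_attempts


# Stub planner for illustration: fails twice, then finds a path.
outcomes = iter([None, None, ["start", "waypoint", "goal"]])
path, attempts = plan_with_retries(lambda: next(outcomes), max_attempts=5)
# path is ["start", "waypoint", "goal"] and attempts is 3.
```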
Then, in the real movement, the arm kept hitting the table even though the path
shown by the program avoided the obstacle. Another group helped us recalibrate
the Baxter’s arm and reconfigure the settings. Thanks to their help, we were
finally able to avoid this issue.
We also encountered a problem during digit recognition. The digit “9” could not
be recognized because in our handwriting we omitted the hook at the bottom of
the digit. This causes the program to read “9” as “1”. The problem could be
fixed by writing “9” in a print style, but that would defeat the point of
handwritten digit recognition. Adding some of our “9”s to the training set
might help the classifier learn better; however, this would not be a valid
approach in a real mail sorting setting. So we decided to track only the last
three digits of the zip codes. This is closer to a real-life engineering
setting because a robot will not be able to sort all the zip codes in a mail
sorting facility. Instead, it will only sort a range of zip codes, and the
first two digits will be the same for the whole region. Using only the last
three digits helped a lot in increasing the success rate.
c. Does your solution have any flaws or hacks? What improvements would you make
if you had additional time?
Preset poses were used to avoid path planning failures. In this part we
hard-coded the poses, but a better and more general approach would need no
preset poses and would let the Baxter decide by itself the regions in which to
drop the mail.
We still have difficulty recognizing the mail size using Computer Vision since
the mail can be handed to the robot from different distances, so we used AR
tags with encoded sizes. In the future, the image processing step might be able
to compute the mail size from the contour of the mail and the IR distance
reading, so that we could drop the AR tags and simplify the hand-over process.
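One candidate for such a formula is the standard pinhole camera relation: real
size = pixel extent × distance / focal length, with the focal length given in
pixels. The sketch below illustrates this possible future improvement and was
not implemented in the project. The focal length and the sample readings are
made-up values, and the relation assumes the mail faces the camera squarely.

```python
def estimate_size_m(pixel_extent, distance_m, focal_length_px):
    """Estimate a real-world length in meters from its extent in the image.

    Pinhole model: size = pixel_extent * distance / focal_length, where the
    focal length is expressed in pixels (obtained from camera calibration).
    """
    if distance_m <= 0 or focal_length_px <= 0:
        raise ValueError("distance and focal length must be positive")
    return pixel_extent * distance_m / focal_length_px


# The same 0.25 m wide envelope seen from two distances (assumed values):
near = estimate_size_m(pixel_extent=400, distance_m=0.5, focal_length_px=800)
far = estimate_size_m(pixel_extent=200, distance_m=1.0, focal_length_px=800)
# Both estimates come out as 0.25 m despite the different apparent sizes.
```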
The digit recognition kept failing
on digit "9". In the future we might be able to fix it by adding more
training samples to distinguish digits "1" and "9".
6) Team
a. Names and short bios of each member of your project
Mo Zhou - She is a graduate student in IEOR, under the supervision of Ken
Goldberg. She has taken courses such as CS 189 and Stats 215A, and has been
working on Machine Learning projects for research.
Jiacheng Wu - She is a computer science major and has taken courses such as
CS 170 and EE 120.
Chunyu Hou - She is an aerospace engineering major and an exchange student from
Harbin Institute of Technology.
Mingyi Zheng - He is a Mechanical Engineering major who has taken EE C128 and
has experience working on Kinect object tracking.
7) Additional Materials
a. Code, URDFs, and launch files you wrote
b. CAD models for any hardware you designed
c. Datasheets for components used in your solution
d. Any additional videos, images, or data from your finished solution.