{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", " \n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Sets in Python

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Welcome! This notebook will teach you about sets in the Python programming language. By the end of this lab, you'll know the basics of sets in Python, including what a set is, the common set operations, and set logic operations.

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " \n", " \n", " \n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Table of Contents

\n", "
\n", " \n", "

\n", " Estimated time needed: 20 min\n", "

\n", "
\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Sets

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Set Content

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A set is a unique collection of objects in Python. You can denote a set with a curly bracket {}. Python will automatically remove duplicate items:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Create a set\n", "\n", "set1 = {\"pop\", \"rock\", \"soul\", \"hard rock\", \"rock\", \"R&B\", \"rock\", \"disco\"}\n", "set1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The process of mapping is illustrated in the figure:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " You can also create a set from a list as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "scrolled": true }, "outputs": [], "source": [ "# Convert list to set\n", "\n", "album_list = [ \"Michael Jackson\", \"Thriller\", 1982, \"00:42:19\", \\\n", " \"Pop, Rock, R&B\", 46.0, 65, \"30-Nov-82\", None, 10.0]\n", "album_set = set(album_list) \n", "album_set" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let us create a set of genres:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Convert list to set\n", "\n", "music_genres = set([\"pop\", \"pop\", \"rock\", \"folk rock\", \"hard rock\", \"soul\", \\\n", " \"progressive rock\", \"soft rock\", \"R&B\", \"disco\"])\n", "music_genres" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Set Operations

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us go over set operations, as these can be used to change the set. Consider the set A:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Sample set\n", "\n", "A = set([\"Thriller\", \"Back in Black\", \"AC/DC\"])\n", "A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " We can add an element to a set using the add() method: " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Add element to set\n", "\n", "A.add(\"NSYNC\")\n", "A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " If we add the same element twice, nothing will happen as there can be no duplicates in a set:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Try to add duplicate element to the set\n", "\n", "A.add(\"NSYNC\")\n", "A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " We can remove an item from a set using the remove method:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Remove the element from set\n", "\n", "A.remove(\"NSYNC\")\n", "A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " We can verify if an element is in the set using the in command:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Verify if the element is in the set\n", "\n", "\"AC/DC\" in A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Sets Logic Operations

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Remember that with sets you can check the difference between sets, as well as the symmetric difference, intersection, and union:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " Consider the following two sets:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Sample Sets\n", "\n", "album_set1 = set([\"Thriller\", 'AC/DC', 'Back in Black'])\n", "album_set2 = set([ \"AC/DC\", \"Back in Black\", \"The Dark Side of the Moon\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "scrolled": true }, "outputs": [], "source": [ "# Print two sets\n", "\n", "album_set1, album_set2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As both sets contain AC/DC and Back in Black we represent these common elements with the intersection of two circles." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can find the intersection of two sets using the & operator:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Find the intersection\n", "\n", "intersection = album_set1 & album_set2\n", "intersection" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can find all the elements that are contained only in album_set1 using the difference() method:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Find the elements in album_set1 but not in album_set2\n", "\n", "album_set1.difference(album_set2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The result contains only elements of album_set1; every element of album_set2, including those in the intersection, is excluded." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The elements in album_set2 but not in album_set1 are given by:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Find the elements in album_set2 but not in album_set1\n", "\n", "album_set2.difference(album_set1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also find the intersection of album_set1 and album_set2 using the intersection() method:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Use intersection method to find the intersection of album_set1 and album_set2\n", "\n", "album_set1.intersection(album_set2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This corresponds to the intersection of the two circles:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The union contains every element that appears in either set, which is represented by coloring both circles:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The union is given by:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "# Find the union of two sets\n", "\n", "album_set1.union(album_set2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also check whether a set is a superset or a subset of another set, respectively:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Check if superset\n", "\n", "album_set1.issuperset(album_set2)" ] 
}, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Check if subset\n", "\n", "album_set2.issubset(album_set1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Both checks above return False, because each set contains an element that the other lacks. Here is an example where issubset() and issuperset() return True:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Check if subset\n", "\n", "{\"Back in Black\", \"AC/DC\"}.issubset(album_set1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Check if superset\n", "\n", "album_set1.issuperset({\"Back in Black\", \"AC/DC\"})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
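This section mentions the symmetric difference but does not demonstrate it. The sketch below redefines the two sample sets so that it runs on its own, computes the symmetric difference with symmetric_difference(), and also shows the operator forms of the methods used above. Treat it as an illustration rather than part of the original lab.

```python
# Same sample sets as in the lab, redefined so the block runs on its own
album_set1 = {'Thriller', 'AC/DC', 'Back in Black'}
album_set2 = {'AC/DC', 'Back in Black', 'The Dark Side of the Moon'}

# Symmetric difference: elements in exactly one of the two sets
sym_diff = album_set1.symmetric_difference(album_set2)
print(sym_diff == {'Thriller', 'The Dark Side of the Moon'})  # True

# Operator forms give the same results as the methods
print((album_set1 | album_set2) == album_set1.union(album_set2))       # True
print((album_set1 - album_set2) == album_set1.difference(album_set2))  # True
print((album_set1 ^ album_set2) == sym_diff)                           # True

# Subset and superset checks with comparison operators
print({'AC/DC'} <= album_set1)    # True
print(album_set1 >= album_set2)   # False
```

The methods accept any iterable as an argument, while the operators require both operands to be sets, so the method form is the more forgiving choice when one side is a list.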
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Quiz on Sets

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert the list ['rap','house','electronic music', 'rap'] to a set:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'electronic music', 'house', 'rap'}" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Write your code below and press Shift+Enter to execute\n", "set(['rap','house','electronic music', 'rap'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Double-click __here__ for the solution.\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Consider the list A = [1, 2, 2, 1] and set B = set([1, 2, 2, 1]), does sum(A) = sum(B) " ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "the sum of A is: 6\n", "the sum of B is: 3\n" ] } ], "source": [ "# Write your code below and press Shift+Enter to execute\n", "A = [1, 2, 2, 1]\n", "B = set([1, 2, 2, 1])\n", "print(\"the sum of A is:\", sum(A))\n", "print(\"the sum of B is:\", sum(B))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Double-click __here__ for the solution.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a new set album_set3 that is the union of album_set1 and album_set2:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "{'AC/DC', 'Back in Black', 'The Dark Side of the Moon', 'Thriller'}" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Write your code below and press Shift+Enter to execute\n", "\n", "album_set1 = set([\"Thriller\", 'AC/DC', 'Back in Black'])\n", "album_set2 = set([ \"AC/DC\", \"Back in Black\", \"The Dark Side of the Moon\"])\n", "album_set3 = album_set1.union(album_set2)\n", "album_set3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Double-click __here__ for the solution.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Find out if album_set1 is a subset of album_set3:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Write your code below and press Shift+Enter to execute\n", "album_set1.issubset(album_set3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Double-click __here__ for the solution.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

The last exercise!

\n", "

Congratulations, you have completed your first lesson and hands-on lab in Python. However, there is one more thing you need to do. The Data Science community encourages sharing work. The best way to share and showcase your work is to share it on GitHub. By sharing your notebook on GitHub you are not only building your reputation with fellow data scientists, but you can also show it off when applying for a job. Even though this was your first piece of work, it is never too early to start building good habits. So, please read and follow this article to learn how to share your work.\n", "


" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "


\n", "

\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

About the Authors:

\n", "

Joseph Santarcangelo is a Data Scientist at IBM, and holds a PhD in Electrical Engineering. His research focused on using Machine Learning, Signal Processing, and Computer Vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD.

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Other contributors: Mavis Zhou" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Copyright © 2018 IBM Developer Skills Network. This notebook and its source code are released under the terms of the MIT License.

" ] } ], "metadata": { "kernelspec": { "display_name": "Python", "language": "python", "name": "conda-env-python-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" } }, "nbformat": 4, "nbformat_minor": 4 }