SQL Candy Assignment

PLEASE READ ATTACHMENT FIRST FOR DETAILED INSTRUCTIONS AND SCREENSHOTS.

The details for this assignment are attached. Here are the basics:

Attachments:

· 2017_product_data_students-final.csv

· 2018_product_data_students-final.csv

· 2019_product_data_students-final.csv

· Candy_part_1_skeleton_for_students.SQL

Your company wants to merge its old product order data into a new data mart to facilitate analysis. You have been tasked with writing an ETL (extract, transform, and load) code sequence, and executing it on three years’ worth of order data. 

In this assignment, you will produce SQL code that scrubs and imports each of the three years’ worth of data and loads it into a single output table called stagingTable.

Along with these instructions, there is another document, ‘Additional Clarification on the Week 6 Candy Assignment’. Please read that document carefully.

You should also read the ‘Data Notes’ in Part C of this document. It is very important that you understand the data and how it changes over the three years, so you can create a ‘stagingTable’ that effectively combines data that might have been captured in different ways over the years.

Let’s get started!

Part A: Upload all the files you will need to SQLite:

  

1. Import the file called “2017_product_data_students.csv” to SQLiteonline.com.  When you import it, give it the table name “pd2017” (no quotes) and set the column name to “First line.”

2. Import “2018_product_data_students.csv” as “pd2018”

3. Import “2019_product_data_students.csv” as “pd2019”

4. If you SELECT * FROM pd2017, you should see something like the below screenshot. Note you should see all three of the import tables on the left, and the pd2017 data should match what is shown as selected.

  

Part B: Extract and Transform your data

Your job is to use SQL to perform an ETL which will accomplish the following:  INSTRUCTIONS IN ATTACHMENT.

1. Start with the skeleton starter script we give you, attached to this assignment. Modify the CREATE TABLE command so the schema is as follows: SEE ATTACHMENT

2. Get the 2017 bit of the script working. SEE ATTACHMENT

3. Get the 2018 part of the script working. SEE ATTACHMENT

4. Get the 2019 part of the script working. SEE ATTACHMENT

5. The script will load it into one final table and call it stagingTable

6. Run the checksum script to verify you have the stagingTable calculated correctly.

7. Export your final output table under the name “XX_output_final.csv” where XX are your initials. To export this, you can just use the Export button on the SQLite menu (it’s right next to the Import button).

You should do all of this in SQLite. You should not export to Excel and do your manipulations in Excel.
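As a rough illustration of step 1, the sketch below creates an empty stagingTable from Python's built-in sqlite3 module. The real schema is in the attachment; the column names here are an assumption, taken only from the names that appear in the Part D queries, and the column types are guesses.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: the columns are the ones referenced by the Part D
# queries; the authoritative schema is in the attachment.
conn.executescript("""
DROP TABLE IF EXISTS stagingTable;
CREATE TABLE stagingTable (
    yearInt      INTEGER,
    monthInt     INTEGER,
    region       TEXT,
    state        TEXT,
    customer_id  INTEGER,
    product_name TEXT,
    orderTotal   INTEGER
);
""")

# PRAGMA table_info returns one row per column; index 1 is the column name.
staging_columns = [row[1] for row in conn.execute("PRAGMA table_info(stagingTable)")]
print(staging_columns)
```

The CREATE TABLE statement can be pasted into SQLiteonline.com on its own; the Python wrapper is only there to make the sketch runnable locally.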

Part C: 2017 Data Notes

Your 2017 order data is contained in the attached file, “2017_product_data_students.csv”, and you should have imported it as “pd2017.” A sample of this file’s data is contained below in Table 1 Sample of order data from 2017. (Note your file may or may not have the same data in it.)

Your field definitions follow:

· Month: integer, corresponds to the month of the sale. For example, 5 = May.

· Country: text, should all be USA. (All data in this exercise should be USA.)

· Region: text, represents the regions within the country.

· State: text, USPS state abbreviations. Each state is within one region.

· Product: text. This is the name of a packaged food product.

· Per-unit price: integer. This represents the per-unit price in cents; for example, 300 indicates that Orange Creepies sell for $3.00 per package. (For the purposes of this exercise, disregard all currency formatting and just use 300 to represent $3.00.)

· Quantity: integer. This represents how many items were in that particular order. The first order here was for 49 packages of Orange Creepies.

· Order Total: integer. This is the per-unit price x the quantity. The first line here indicates that 300 x 49 = 14700 (or $147.00) was the price of the first order.

Table 1 Sample of order data from 2017 – SEE ATTACHMENT
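The 2017 field definitions suggest a simple load: the order total is the per-unit price times the quantity. Here is a minimal sketch of that step, assuming the imported column names match the CSV header text (they may differ in your file) and using the hypothetical staging schema built from the Part D column names. Only the price, quantity and total of the sample row come from the notes above; the month, region and state are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the imported table; the quoted column names are assumed
-- to match the CSV header row and may differ in the real file.
CREATE TABLE pd2017 (
    "Month" INTEGER, "Country" TEXT, "Region" TEXT, "State" TEXT,
    "Product" TEXT, "Per-Unit Price" INTEGER, "Quantity" INTEGER,
    "Order Total" INTEGER
);
-- Price, quantity and total come from the field notes;
-- month, region and state are invented for illustration.
INSERT INTO pd2017 VALUES
    (5, 'USA', 'South', 'FL', 'Orange Creepies', 300, 49, 14700);

-- Hypothetical staging schema (see the attachment for the real one).
CREATE TABLE stagingTable (
    yearInt INTEGER, monthInt INTEGER, region TEXT, state TEXT,
    customer_id INTEGER, product_name TEXT, orderTotal INTEGER
);

-- The 2017 file has no customer ID, so that column is left NULL here.
INSERT INTO stagingTable
SELECT 2017, "Month", "Region", "State", NULL, "Product",
       "Per-Unit Price" * "Quantity"
FROM pd2017;
""")

# Rows whose stored total disagrees with price x quantity (ideally none).
mismatches_2017 = conn.execute(
    'SELECT COUNT(*) FROM pd2017 '
    'WHERE "Order Total" <> "Per-Unit Price" * "Quantity"'
).fetchone()[0]
rows_2017 = conn.execute(
    "SELECT yearInt, monthInt, product_name, orderTotal FROM stagingTable"
).fetchall()
print(mismatches_2017, rows_2017)
```

Recomputing the total rather than copying the stored "Order Total" is a design choice for the sketch; the mismatch count shows whether the two agree in your data.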

2018 Data Notes:

Your 2018 order data is contained in the attached file, “2018_product_data_students.csv.”

A sample of this file’s data is contained below as Table 2 Sample of order data from 2018. (Note your file may or may not have the same data in it.)

Your field definitions follow:

· Month: integer, corresponds to the month of the sale. For example, 5 = May.

· Region: text, represents the regions within the country.

· Customer_ID: integer, represents the customer’s unique Customer ID number.

· Product: text. This is the name of a packaged food product. 

· Per-unit price: integer. This represents the per-unit price in cents; for example, 363 indicates that PearApple sells for $3.63  per package. (For the purposes of this exercise, you should disregard all currency formatting and just use 363 to represent $3.63.)

· Quantity_1: integer. This represents how many items were in the first shipment of that particular order. This year we had shipping problems, and could often not ship the entire order all at once. Orders were split into two shipments where necessary, and Quantity_1 reflects how many units were shipped first. (Assume all shipments were completed in the month listed, and that no shipments had the first shipment in one month and the second shipment in the subsequent month.) 

· Quantity_2: integer. This represents how many items were in the second shipment of that particular order. A 0 indicates a second shipment was not necessary. To get the total number of items shipped, you need to add Quantity_1 and Quantity_2.

· The first line here reflects that PearApple has a first shipment of 25 units and a second shipment of 92 units, all within the month of January, for a total of 25 + 92 = 117 units.

Table 2 Sample of order data from 2018 – SEE ATTACHED
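For 2018, the two shipment quantities have to be added before multiplying by the per-unit price. The sketch below shows one way that might look, again assuming the imported column names match the field names above and using the hypothetical staging schema. The price and quantities come from the PearApple example; the region and customer ID are invented, and the state is left NULL because the 2018 field list has no state column (the attachment may specify something different).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the imported table; column names are assumptions.
CREATE TABLE pd2018 (
    "Month" INTEGER, "Region" TEXT, "Customer_ID" INTEGER, "Product" TEXT,
    "Per-Unit Price" INTEGER, "Quantity_1" INTEGER, "Quantity_2" INTEGER
);
-- PearApple example from the notes (January, 363 cents, 25 + 92 units);
-- the region and customer ID are invented for illustration.
INSERT INTO pd2018 VALUES (1, 'Northeast', 1001, 'PearApple', 363, 25, 92);

-- Hypothetical staging schema (see the attachment for the real one).
CREATE TABLE stagingTable (
    yearInt INTEGER, monthInt INTEGER, region TEXT, state TEXT,
    customer_id INTEGER, product_name TEXT, orderTotal INTEGER
);

-- Total units = first shipment + second shipment; no state in 2018.
INSERT INTO stagingTable
SELECT 2018, "Month", "Region", NULL, "Customer_ID", "Product",
       "Per-Unit Price" * ("Quantity_1" + "Quantity_2")
FROM pd2018;
""")

rows_2018 = conn.execute(
    "SELECT yearInt, monthInt, customer_id, orderTotal FROM stagingTable"
).fetchall()
print(rows_2018)
```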

  

2019 Data Notes:

Your 2019 order data is contained in the attached file, “2019_product_data_students.csv.”

A sample of this file’s data is contained below as Table 3 Sample of order data from 2019. (Note your file may or may not have the same data in it.)

Your field definitions follow:

· Month: integer, corresponds to the month of the sale. For example, 5 = May.

· Country: text, represents the country of the customer. Should all be USA.

· Region: text, represents the regions within the country.

· State: USPS code for the 50 United States.

· Product: text. Same as previous years.

· Per-unit price: integer. This represents the per-unit price in cents; same as previous years.

· Quantity: This represents how many items were in that particular order. The first order here was for 95 packages of Only Pancakes.

· Order Subtotal: This represents the order subtotal, calculated as per-unit price x quantity. For example, the first order here reflects a per-unit price of 413 cents x 95 units, for a subtotal of 39,235 (or $392.35). 

· Quantity Discount: This reflects the new policy (effective January 1, 2019) under which every order of 90 units or more automatically earns a 10% discount. An order of 89 units does not earn the discount; an order of 90 units does. All order discounts have been rounded to the nearest penny, so you can assume this field has no decimals in it. In the data below:

o Order 0, on the first line, of 95 Only Pancakes to Florida, did qualify for the Quantity Discount, because an order quantity of 95 meets the 90-unit threshold. The Quantity Discount has been computed as 3924, or 10% of 39235 rounded to the nearest penny. In this case, the final order total would be 39,235 – 3,924 = 35,311 (or $353.11).

o Order 4, on the fifth line, of 31 Future Toasts to North Carolina, did not qualify for the Quantity Discount. Therefore, the Order total would simply be the Order subtotal.

Table 3 Sample of order data from 2019 – SEE ATTACHED
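For 2019, the final order total is the subtotal minus the quantity discount. The sketch below loads orders 0 and 4 from Table 3 and also cross-checks the stored discount against the 90-unit rule. The imported column names and the staging schema are assumptions, and the integer rounding expression `(subtotal + 5) / 10` is just one way to round 10% to the nearest cent without floating point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the imported table; column names are assumptions.
CREATE TABLE pd2019 (
    "Month" INTEGER, "Country" TEXT, "Region" TEXT, "State" TEXT,
    "Product" TEXT, "Per-Unit Price" INTEGER, "Quantity" INTEGER,
    "Order Subtotal" INTEGER, "Quantity Discount" INTEGER
);
-- Orders 0 and 4 from Table 3.
INSERT INTO pd2019 VALUES
    (9,  'USA', 'South', 'FL', 'Only Pancakes', 413, 95, 39235, 3924),
    (11, 'USA', 'South', 'NC', 'Future Toast',  253, 31,  7843,    0);

-- Hypothetical staging schema (see the attachment for the real one).
CREATE TABLE stagingTable (
    yearInt INTEGER, monthInt INTEGER, region TEXT, state TEXT,
    customer_id INTEGER, product_name TEXT, orderTotal INTEGER
);

-- Final total = subtotal - discount; no customer ID in the 2019 file.
INSERT INTO stagingTable
SELECT 2019, "Month", "Region", "State", NULL, "Product",
       "Order Subtotal" - "Quantity Discount"
FROM pd2019;
""")

# Count rows whose stored discount disagrees with the 90-unit rule.
# SQLite divides two integers as integers, so (x + 5) / 10 rounds half up.
bad_discounts = conn.execute('''
    SELECT COUNT(*) FROM pd2019
    WHERE "Quantity Discount" <>
          CASE WHEN "Quantity" >= 90
               THEN ("Order Subtotal" + 5) / 10
               ELSE 0 END
''').fetchone()[0]
totals_2019 = [r[0] for r in conn.execute(
    "SELECT orderTotal FROM stagingTable ORDER BY monthInt")]
print(bad_discounts, totals_2019)
```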

  

Part D: Check Your Own Work

1.  You can run the following SQL code on your staging table. There is nothing to turn in from this bit. It should yield the following first few rows:

Select region, yearint, monthInt, count(*) from stagingTable where monthInt = 5 group by region, yearInt, monthInt;

2. You can also run the following code to debug. You should get the following rows:

Select yearInt, monthInt, state, customer_id, product_name, orderTotal from stagingTable 

where product_name = 'Big Waffle' and monthInt = 4

order by product_name, yearInt, monthInt, state, customer_id, orderTotal;

  

Now that you’ve debugged your code, it’s time to get a checksum! Run the following code to get a checksum. The checksum will be a number. Put this checksum number on the top of your homework. See table below for help with your CHECKSUM result. 

select sum(yearInt * monthInt * orderTotal)%2341 as checksum from stagingTable;

3. Once you get the result of your CHECKSUM look at table below for ways to troubleshoot any issues with your ETL statements. SEE ATTACHMENT
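To see what the checksum query computes, here is a small sketch on a two-row stand-in for stagingTable. The two rows reuse the 2019 order totals worked out in the Data Notes (35311 and 7843); the real table will have many more rows, so the number printed here is not the checksum you should expect. The point is that the modulo is applied once, to the grand total, not row by row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Cut-down stand-in: only the three columns the checksum uses.
CREATE TABLE stagingTable (yearInt INTEGER, monthInt INTEGER, orderTotal INTEGER);
INSERT INTO stagingTable VALUES (2019, 9, 35311), (2019, 11, 7843);
""")

# Same query as in the assignment text.
checksum = conn.execute(
    "SELECT SUM(yearInt * monthInt * orderTotal) % 2341 AS checksum "
    "FROM stagingTable"
).fetchone()[0]

# The same arithmetic done by hand in Python, for comparison.
expected = (2019 * 9 * 35311 + 2019 * 11 * 7843) % 2341
print(checksum, expected)
```

Because every row's year, month and total feed into the sum, a single wrong orderTotal or a missing row will usually change the result, which is what makes it useful as a quick check.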

TURN IN:

1. Your output file, called “XX_output_final.csv” where XX are your initials.

2. All the SQL code you used to execute this.

3. A document that contains

a. CHECKSUM: XXX where XXX is the checksum number produced. Put this in big font right on the top.

b. A one page outline of your ETL process. Which functions did you use, and what logic did you follow? This should be at the level that your boss, who has an MBA but not an IT/database background, can follow it. Do not use “computer-ese” here; use regular business English.

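Table 3 Sample of order data from 2019 (the unlabeled first column is the order number, starting at 0)

Order Month Country Region State Product Per-Unit Price Quantity Order Subtotal Quantity Discount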
0 9 USA South FL Only Pancakes 413 95 39235 3924
1 6 USA West HI Big Waffle 268 93 24924 2492
2 5 USA Northeast RI Grey Gummies 446 95 42370 4237
3 9 USA Midwest NE Funky Pops 380 100 38000 3800
4 11 USA South NC Future Toast 253 31 7843 0
5 8 USA West WA Mr Greens 447 76 33972 0
6 8 USA South MD Giant Gummies 347 93 32271 3227
7 1 USA South TN Power Gels 193 11 2123 0
8 1 USA South FL Red Hot Chili Peppers 233 24 5592 0
9 12 USA South DE Nap Be Gone 500 67 33500 0
10 2 USA Northeast MA Cheezos 372 63 23436 0
11 1 USA South NC GMO Guardian 176 77 13552 0
12 4 USA South DE Only Pancakes 413 86 35518 0
13 3 USA South KY Rotpunkt 220 20 4400 0
14 2 USA Midwest IL Red Hot Chili Peppers 233 82 19106 0
15 8 USA Midwest IL Panda Gummies 104 33 3432 0
16 11 USA Midwest NE Yellow Zonkers 134 29 3886 0
17 7 USA West CA Grey Gummies 446 97 43262 4326
18 11 USA South NC Future Toast 253 93 23529 2353
19 10 USA Midwest MO Red Hot Chili Peppers 233 77 17941 0
20 12 USA South KY Orange Creepies 300 88 26400 0
21 4 USA West CA Big Waffle 268 7 1876 0
22 3 USA West NV Cheezos 372 77 28644 0
23 7 USA West OR Power Gels 193 15 2895 0
24 2 USA Northeast VT PearApple 363 57 20691 0
25 1 USA South NC GMO Guardian 176 35 6160 0
26 3 USA South FL Meat B Gone 495 34 16830 0
27 9 USA South TN Yellow Zonkers 134 43 5762 0
28 1 USA West CA Ultra-Salty Chips 454 34 15436 0
29 10 USA South MD Farm Fresh 365 14 5110 0
30 6 USA South TN Red Rock Candy 189 55 10395 0
31 6 USA West WA Panda Gummies 104 19 1976 0
32 6 USA South MD Meat B Gone 495 91 45045 4505
33 12 USA West CA GMO Guardian 176 46 8096 0
34 4 USA South NC Big Waffle 268 83 22244 0
35 1 USA South TN Mr Greens 447 17 7599 0
36 4 USA South MD Meat B Gone 495 54 26730 0
37 8 USA South SC Purple Pain 105 12 1260 0
38 4 USA South FL Crocodile Tears 143 40 5720 0
39 4 USA West CA Big Waffle 268 2 536 0
40 2 USA Northeast VT Big Waffle 268 68 18224 0
41 9 USA Northeast MA Ultra-Salty Chips 454 29 13166 0
42 10 USA West CA Funky Pops 380 31 11780 0
43 3 USA South TN Farm Fresh 365 43 15695 0
44 12 USA South MD Orange Creepies 300 100 30000 3000
45 6 USA South DE Mr Greens 447 33 14751 0
46 1 USA Northeast MA Power Gels 193 63 12159 0
47 11 USA South FL Only Pancakes 413 65 26845 0
48 5 USA West OR Farm Fresh 365 79 28835 0
49 2 USA West AZ Farm Fresh 365 36 13140 0
50 11 USA South FL Just The Fruit 160 85 13600 0
51 12 USA Midwest NE Organiks 257 23 5911 0
52 4 USA West HI Big Waffle 268 87 23316 0
53 7 USA South MD Just The Fruit 160 27 4320 0
54 4 USA South DE Rotpunkt 220 88 19360 0
55 12 USA West AZ Mr Greens 447 42 18774 0
56 7 USA West NV Meat B Gone 495 38 18810 0
57 3 USA West OR Red Rock Candy 189 68 12852 0
58 3 USA West OR Ultra-Salty Chips 454 89 40406 0
59 5 USA Midwest NE Grey Gummies 446 12 5352 0
60 11 USA South FL Panda Gummies 104 96 9984 998
61 5 USA Midwest MO Red Rock Candy 189 77 14553 0
62 3 USA West NV Just The Fruit 160 40 6400 0
63 3 USA Northeast MA Purple Pain 105 3 315 0
64 2 USA South NC Only Pancakes 413 66 27258 0
65 11 USA West ID GMO Guardian 176 30 5280 0
66 3 USA Northeast RI Yellow Zonkers 134 16 2144 0
67 3 USA Northeast MA Giant Gummies 347 7 2429 0
68 12 USA West WA Yellow Zonkers 134 49 6566 0
69 10 USA West WA Purple Pain 105 9 945 0
70 12 USA Northeast RI Red Hot Chili Peppers 233 17 3961 0
71 7 USA West AZ Giant Gummies 347 68 23596 0
72 9 USA Midwest NE Red Rock Candy 189 8 1512 0
73 11 USA South SC Future Toast 253 98 24794 2479
74 1 USA West AZ Mr Greens 447 74 33078 0
75 3 USA South MD Mr Greens 447 13 5811 0
76 10 USA South NC Power Gels 193 96 18528 1853
77 1 USA West OR Rotpunkt 220 58 12760 0
78 6 USA South TN Grey Gummies 446 72 32112 0
79 3 USA West OR Mr Greens 447 73 32631 0
80 2 USA West CA Giant Gummies 347 25 8675 0
81 4 USA Midwest MO PearApple 363 4 1452 0
82 12 USA Northeast RI Big Waffle 268 25 6700 0
83 6 USA Northeast VT Just The Fruit 160 17 2720 0
84 10 USA Midwest MI Grey Gummies 446 2 892 0
85 4 USA South MD Future Toast 253 65 16445 0
86 7 USA South TN Purple Pain 105 41 4305 0
87 1 USA West WA Mr Greens 447 98 43806 4381
88 5 USA South DE PearApple 363 39 14157 0
89 11 USA Midwest NE Panda Gummies 104 2 208 0
90 8 USA Northeast MA PearApple 363 65 23595 0
91 12 USA South MD Ultra-Salty Chips 454 16 7264 0
92 11 USA West WA Just The Fruit 160 78 12480 0
93 9 USA South FL Organiks 257 100 25700 2570
94 9 USA South TN Ultra-Salty Chips 454 76 34504 0
95 5 USA South FL Nap Be Gone 500 74 37000 0
96 12 USA West CA Organiks 257 73 18761 0
97 8 USA Midwest IL Cheezos 372 84 31248 0
98 4 USA Midwest MO PearApple 363 26 9438 0
99 7 USA South DE Organiks 257 68 17476 0
100 9 USA South TN Green Lightning 161 36 5796 0
101 11 USA Northeast VT PearApple 363 69 25047 0
102 12 USA South SC Farm Fresh 365 64 23360 0
103 6 USA South FL PearApple 363 42 15246 0
104 8 USA Midwest MI Funky Pops 380 79 30020 0
105 6 USA South NC Yellow Zonkers 134 25 3350 0
106 9 USA West WA Mr Greens 447 94 42018 4202
107 4 USA Midwest MO Green Lightning 161 91 14651 1465
108 2 USA West CA Future Toast 253 5 1265 0
109 1 USA South MD Power Gels 193 86 16598 0
110 5 USA West OR Orange Creepies 300 76 22800 0
111 11 USA Midwest MO Purple Pain 105 12 1260 0
112 2 USA Midwest NE Red Rock Candy 189 5 945 0
113 1 USA South FL GMO Guardian 176 75 13200 0
114 1 USA Northeast MA Red Hot Chili Peppers 233 20 4660 0
115 6 USA Northeast MA Orange Creepies 300 58 17400 0
116 10 USA South KY Giant Gummies 347 85 29495 0
117 5 USA South NC GMO Guardian 176 39 6864 0
118 7 USA South TN Purple Pain 105 4 420 0
119 5 USA West OR Only Pancakes 413 89 36757 0
120 3 USA South FL Orange Creepies 300 33 9900 0
121 5 USA West CA Nap Be Gone 500 26 13000 0
122 5 USA Northeast RI Mr Greens 447 71 31737 0
123 7 USA South FL Future Toast 253 80 20240 0
124 9 USA South DE PearApple 363 2 726 0
125 3 USA South DE GMO Guardian 176 7 1232 0
126 1 USA Midwest MI GMO Guardian 176 27 4752 0
127 5 USA South NC Giant Gummies 347 76 26372 0
128 11 USA South FL Organiks 257 39 10023 0
129 2 USA Midwest IL Grey Gummies 446 65 28990 0
130 12 USA South KY Grey Gummies 446 31 13826 0
131 3 USA West ID Red Rock Candy 189 65 12285 0
132 3 USA West ID Funky Pops 380 53 20140 0
133 4 USA South TN Just The Fruit 160 4 640 0
134 7 USA Midwest IL Future Toast 253 17 4301 0
135 5 USA Midwest MI GMO Guardian 176 36 6336 0
136 10 USA West AZ Green Lightning 161 67 10787 0
137 9 USA West HI Purple Pain 105 17 1785 0
138 7 USA Northeast RI Yellow Zonkers 134 100 13400 1340
139 12 USA Northeast MA Rotpunkt 220 33 7260 0
140 12 USA West HI Green Lightning 161 40 6440 0
141 4 USA Midwest NE Rotpunkt 220 86 18920 0
142 6 USA South MD Orange Creepies 300 8 2400 0
143 1 USA West CA Organiks 257 80 20560 0
144 6 USA Midwest NE Panda Gummies 104 79 8216 0
145 9 USA South MD Big Waffle 268 28 7504 0
146 11 USA Northeast VT Orange Creepies 300 86 25800 0
147 12 USA West NV Farm Fresh 365 63 22995 0
148 6 USA West ID Mr Greens 447 100 44700 4470
149 2 USA Midwest NE Rotpunkt 220 23 5060 0
150 8 USA West WA
