Comments on: SQL for Data Analysis – Tutorial for Beginners – ep2
https://data36.com/sql-where-clause-tutorial-beginners-ep2/
Learn Data Science the Hard Way!
Fri, 21 Feb 2025 21:29:37 +0000

By: Tomi Mester https://data36.com/sql-where-clause-tutorial-beginners-ep2/#comment-191047 Mon, 02 Nov 2020 15:00:06 +0000 https://data36.com/?p=1073#comment-191047 In reply to Mariska.

hey Mariska,

I’m not 100% sure what the original task was based on your description, but there is definitely a comparison operator missing from your query:

SELECT *
FROM flight_delays
WHERE month IN (3,5)
AND dayofweek IN (6,7)
AND ((airtime > 200 AND airtime < 220) OR (airtime > 250 AND airtime < 270));

I suppose this is what you meant. And this query will just run! : )
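
To see why the extra parentheses matter, here is a minimal sketch that runs the query above against a tiny in-memory SQLite table from Python. The five sample rows are invented for illustration (they are not from the real flight_delays data), and SQLite only stands in for whatever database you use for the tutorial. The second query drops the outer parentheses to show that AND binds tighter than OR, so the last airtime range ends up ignoring the month and day filters.

```python
import sqlite3

# Hypothetical sample rows: (month, dayofweek, airtime). Not real data.
rows = [
    (3, 6, 210),  # right month, right day, first airtime range
    (5, 7, 260),  # right month, right day, second airtime range
    (3, 6, 230),  # airtime falls between the two ranges
    (4, 6, 210),  # wrong month
    (5, 2, 260),  # wrong day of week
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flight_delays (month INTEGER, dayofweek INTEGER, airtime INTEGER)"
)
conn.executemany("INSERT INTO flight_delays VALUES (?, ?, ?)", rows)

# The query from the reply: both airtime ranges sit inside one outer pair
# of parentheses, so the month and day filters apply to both of them.
with_parens = sorted(conn.execute("""
    SELECT * FROM flight_delays
    WHERE month IN (3,5)
    AND dayofweek IN (6,7)
    AND ((airtime > 200 AND airtime < 220) OR (airtime > 250 AND airtime < 270));
""").fetchall())

# Same conditions without the outer parentheses: the OR branch now stands
# alone, so any row with airtime between 250 and 270 gets through.
without_parens = sorted(conn.execute("""
    SELECT * FROM flight_delays
    WHERE month IN (3,5)
    AND dayofweek IN (6,7)
    AND (airtime > 200 AND airtime < 220) OR (airtime > 250 AND airtime < 270);
""").fetchall())

print(with_parens)     # two rows
print(without_parens)  # three rows, including the one with dayofweek 2
conn.close()
```

With the outer parentheses only the two rows that satisfy every filter come back; without them, the (5, 2, 260) row slips in even though its day of week is not 6 or 7.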

Hope this helps!
Tomi

By: Tomi Mester https://data36.com/sql-where-clause-tutorial-beginners-ep2/#comment-191046 Mon, 02 Nov 2020 14:50:42 +0000 https://data36.com/?p=1073#comment-191046 In reply to Justin H..

hey Justin,

thanks for the question and no worries, that’s quite a typical data cleaning issue. : )
It can be done in bash or Python (I prefer bash.)

In this very particular case this will work:
sed "s/, / /g"

But of course, you’ll have to check this for all lines.
Although it’s a good enough assumption that in comments they’ll use spaces after commas… Anyways, it would get much easier if you could save the file separated by ; or tabs.
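
For anyone who would rather do this in Python, here is a minimal sketch of the same replacement. It relies on the same assumption as the sed command (a comma inside a comment is followed by a space, a comma between fields is not). The sample line and the function name are made up for illustration.

```python
def strip_comma_space(line: str) -> str:
    """Replace every ', ' with a single space, like sed "s/, / /g"."""
    return line.replace(", ", " ")


# Hypothetical CSV line: three fields, the last one is a free-text comment
# that contains a comma followed by a space.
sample = "1,Tomi,great tutorial, thanks a lot"
cleaned = strip_comma_space(sample)

print(cleaned)
print(len(sample.split(",")), "fields before,", len(cleaned.split(",")), "fields after")
```

As with sed, this is only safe once you have checked that no real field separator in your file is followed by a space.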

Tomi

By: Tomi Mester https://data36.com/sql-where-clause-tutorial-beginners-ep2/#comment-191045 Mon, 02 Nov 2020 14:42:39 +0000 https://data36.com/?p=1073#comment-191045 In reply to sriram.

hey Sriram,

it’s a bash command that selects the right columns, removes the lines with NA values and then pushes the whole thing into a new file called sql_ready.csv.
You don’t have to worry about that for now.
But if you want to learn more about it, I recommend my Bash articles, here:
https://data36.com/learn-data-analytics-bash-scratch/
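
If it helps to see the steps spelled out, here is a rough Python sketch of what the two middle commands do: cut keeps only the listed comma-separated columns, and grep -v ',NA' throws away every line that still contains ",NA". The function names and the two sample lines are invented for the demo; the column numbers are the ones from the original command.

```python
# Column numbers (1-based) taken from the cut -f list in the bash command.
FIELDS = [1, 2, 3, 4, 5, 7, 10, 11, 14, 15, 16, 17, 18, 19]


def cut_fields(line, fields=FIELDS, delimiter=","):
    """Keep only the requested columns, like cut -d',' -f..."""
    parts = line.split(delimiter)
    return delimiter.join(parts[i - 1] for i in fields if i <= len(parts))


def sql_ready(lines):
    """Yield the trimmed lines that do not contain ',NA', like grep -v ',NA'."""
    for line in lines:
        selected = cut_fields(line.rstrip("\n"))
        if ",NA" not in selected:
            yield selected


# Hypothetical input: 19 columns per line, the second line has NA in column 14.
good = ",".join(str(n) for n in range(1, 20))
bad = ",".join("NA" if n == 14 else str(n) for n in range(1, 20))

result = list(sql_ready([good, bad]))
print(result)
```

In the real command the last part, > sql_ready.csv, simply writes the surviving lines into a new file instead of printing them.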

Tomi

By: Tomi Mester https://data36.com/sql-where-clause-tutorial-beginners-ep2/#comment-191044 Mon, 02 Nov 2020 14:40:56 +0000 https://data36.com/?p=1073#comment-191044 In reply to Zalan Taller.

hey Zalan,

that means that you have extracted the same file multiple times (hence 2007.csv.5 and not 2007.csv).

You just have to remove all the 2007.csv files with the rm 2007.csv* command, then rerun the dtrx line and you’ll be fine.
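
If you would rather do the cleanup from Python, here is a rough sketch of the same idea. One thing to be aware of: the pattern 2007.csv* also matches the 2007.csv.bz2 archive when it sits in the same folder, so this sketch keeps any file ending in .bz2 and deletes only the extracted copies. The function name and the temporary folder with empty placeholder files are made up for the demo.

```python
import glob
import os
import tempfile


def remove_extracted_copies(directory, keep_suffix=".bz2"):
    """Delete files matching 2007.csv* in directory, except the archive."""
    removed = []
    for path in sorted(glob.glob(os.path.join(directory, "2007.csv*"))):
        if path.endswith(keep_suffix):
            continue  # leave the compressed archive alone
        os.remove(path)
        removed.append(os.path.basename(path))
    return removed


# Demo in a throwaway folder with empty placeholder files.
with tempfile.TemporaryDirectory() as workdir:
    for name in ["2007.csv", "2007.csv.1", "2007.csv.5", "2007.csv.bz2"]:
        open(os.path.join(workdir, name), "w").close()

    removed = remove_extracted_copies(workdir)
    remaining = sorted(os.listdir(workdir))

print(removed)
print(remaining)
```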

By: Zalan Taller https://data36.com/sql-where-clause-tutorial-beginners-ep2/#comment-121429 Wed, 29 Jan 2020 21:09:05 +0000 https://data36.com/?p=1073#comment-121429 Hey Tomi,

I am getting this warning after trying to unzip the csv file:
dtrx: WARNING: extracting /home/tallizali/2007.csv.bz2 to 2007.csv.5

Do you have any idea what I should do differently?

Thanks a lot in advance!

Zalan

By: Tomi Mester https://data36.com/sql-where-clause-tutorial-beginners-ep2/#comment-115183 Sun, 29 Dec 2019 16:15:49 +0000 https://data36.com/?p=1073#comment-115183 In reply to Simeon.

hi Simeon,

thank you so much for drawing my attention to this!
I reached out to their webmaster (and hopefully they will restore the dataset.)

But until then you can use this temporary link instead:
http://46.101.230.157/sql_tutorial/2007.csv.bz2

Cheers,
Tomi

By: Simeon https://data36.com/sql-where-clause-tutorial-beginners-ep2/#comment-115052 Sat, 28 Dec 2019 03:28:34 +0000 https://data36.com/?p=1073#comment-115052 Unfortunately it looks like the link you provided to download the data: http://stat-computing.org/dataexpo/2009/2007.csv.bz2 no longer works.

By: Mariska https://data36.com/sql-where-clause-tutorial-beginners-ep2/#comment-97201 Thu, 17 Oct 2019 09:30:40 +0000 https://data36.com/?p=1073#comment-97201 In reply to Mariska.

well stupid me forgot to attach my failed code:

select * from flight_delays
where month in (3,5)
and dayofweek in (6,7)
and (airtime > 200 and airtime 250 and airtime < 270);

Then adding the NOT function at the end.

By: Mariska https://data36.com/sql-where-clause-tutorial-beginners-ep2/#comment-97200 Thu, 17 Oct 2019 09:27:11 +0000 https://data36.com/?p=1073#comment-97200 Hi Tomi, I played around with my own test, and I have a question. So just answer when you’re comfortably available!

So let’s say I’d like to filter the flight data of
>> Month 3 and 5 where
>> day of week is only 6 and 7
>> airtime only 200or function – so it failed. The only workaround I managed is to include this at the end:

and month NOT in (1,2,4,6,7,8,9,10,11,12);

So.. despite receiving the result I wanted, I still feel like without the NOT function there should be another solution with just smartly using the and / or function and parentheses.. Would you show me how you’d do this prompt?

By: sriram https://data36.com/sql-where-clause-tutorial-beginners-ep2/#comment-55869 Wed, 30 Jan 2019 03:03:47 +0000 https://data36.com/?p=1073#comment-55869 Hey Tomi,

Can you explain how this line actually works?

cat 2007.csv | cut -d',' -f1,2,3,4,5,7,10,11,14,15,16,17,18,19 | grep -v ',NA' > sql_ready.csv
