A Jupyter Notebook is an external tool that data scientists tend to use because it lets you mix documentation with code.
Notice that the notebook contains text documentation that explains what is happening. It is a report, but it is also executable: you can share a notebook with someone and they can run the code. It therefore also documents the exact work that the data scientist did.
File type: *.ipynb
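An .ipynb file is a JSON document whose "cells" list holds the markdown (documentation) cells and the code cells side by side. The following is a minimal sketch of that layout, using a hand-written two-cell notebook as a stand-in for a real file; the cell contents are made up for illustration.

```python
import json

# A hand-written, minimal notebook structure (illustrative content only).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Report\n", "Explains what the code below does."],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print('hello')"],
        },
    ],
}

# Serialize and parse again, as if the notebook had been read from disk.
text = json.dumps(notebook, indent=1)
loaded = json.loads(text)

# Documentation and code live in the same list of cells.
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)
```

Because the format is plain JSON, a notebook can be shared and versioned like any other text file.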
ML Engine
It can be part of the process of developing your own model in Keras.
Jupyter comes with what are called Jupyter Magics, and we're going to go ahead and use one to run BigQuery.
%load_ext google.cloud.bigquery
Jupyter Magics let you run a BigQuery query and get the results back in the notebook.
%%bigquery --project $PROJECT
SELECT
url, title, score
FROM
`bigquery-public-data.hacker_news.stories`
WHERE
LENGTH(title) > 10
AND score > 10
AND LENGTH(url) > 0
LIMIT 10
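Create a training DataFrame and an evaluation DataFrame with Pandas, using about 75% of the data for training and 25% for evaluation. The fingerprint of the title modulo 4 puts each row into one of four buckets: buckets 1 to 3 go to training and bucket 0 goes to evaluation.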
traindf = bg.query(query + " AND MOD(ABS(FARM_FINGERPRINT(title)), 4) > 0").to_dataframe()
evaldf = bg.query(query + " AND MOD(ABS(FARM_FINGERPRINT(title)), 4) = 0").to_dataframe()
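The idea behind the split can be tried locally without BigQuery. The sketch below is only an illustration: it swaps in an MD5 hash from the Python standard library for FARM_FINGERPRINT (which is a BigQuery function), and the titles and the helper name `bucket` are made up. It shows that hashing the title gives a repeatable split with roughly three quarters of the rows in training.

```python
import hashlib


def bucket(title: str, num_buckets: int = 4) -> int:
    """Map a title to a stable bucket in [0, num_buckets).

    MD5 is a stand-in for FARM_FINGERPRINT here; the exact bucket values
    differ from BigQuery's, but the splitting idea is the same.
    """
    digest = hashlib.md5(title.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


# Hypothetical titles standing in for the query results.
titles = [f"story {i}" for i in range(1000)]

# Same rule as the two queries: bucket > 0 is training, bucket == 0 is evaluation.
train = [t for t in titles if bucket(t) > 0]
evaluation = [t for t in titles if bucket(t) == 0]

print(len(train), len(evaluation))
```

Because the bucket depends only on the title, the same story always lands in the same set, no matter how often the split is rerun.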
Figure out how many training examples we have for each source
traindf['source'].value_counts()
Figure out how many examples are in the evaluation set for each source
evaldf['source'].value_counts()
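As a minimal sketch of what value_counts() returns, here is the same call on a small hand-made DataFrame. The rows and the source names are hypothetical and are not taken from the query results.

```python
import pandas as pd

# Hypothetical rows; only the 'source' column matters for the count.
df = pd.DataFrame(
    {
        "source": ["github", "nytimes", "github", "techcrunch", "github", "nytimes"],
        "title": ["a", "b", "c", "d", "e", "f"],
    }
)

# value_counts() returns a Series indexed by the distinct values,
# sorted by count in descending order.
counts = df["source"].value_counts()
print(counts)
```

Comparing the counts for the training and evaluation sets is a quick check that both sets contain every source in similar proportions.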
A Jupyter magic that tells matplotlib to render figures inline in the notebook instead of just dumping the data into a variable.
%matplotlib inline
ax = attack_stats.toPandas().plot.bar(x='protocol_type', subplots=True, figsize=(10,25))