27.07.2025

Research data infrastructure
OLAP and OLTP
wigle WLAN stats
neogeography

13.07.2025

DNS sinkhole

04.07.2025

pragmaticengineer - Gergely Orosz

01.07.2025

Information theory
Claude Shannon
Prism syntax highlighting

14.06.2025

mergely

08.06.2025

MCP Model Context Protocol

25.05.2025

MySQL & MariaDB creator: Finn Michael "Monty" Widenius
InnoDB storage engine

19.05.2025

php creator Rasmus Lerdorf
Transparent Data Encryption (TDE)
MySQL Keyring
MySQL classic protocol (default port: 3306) vs. MySQL X Protocol (default port: 33060)
LFTP
Supabase Postgres is only for lightweight use cases with ETL/ELT Pipelines

10.05.2025

Tesseract OCR

08.05.2025

Microsoft Data Platform - Tools (SSIS, SSAS, SSRS)

17.04.2025

Atom Publishing Protocol
PubSubHubbub
Burn identity
Dublin Core

14.04.2025

Postgres libpq — C Library
Postgres Environment Variables

13.04.2025

psycopg3
Set LANG=de_DE.ISO-8859-1 on Arch Linux
Virtual environment Python
Flask - lightweight WSGI (Web Server Gateway Interface) web framework
Flask-SQLAlchemy
package manager delivery

10.04.2025

PL/SQL - Oracle's procedural extension of SQL

09.04.2025

Data Poisoning
taxonomy of LLM attacks
Character encoding
MCP - Model Context Protocol

04.04.2025

Python Designer
static code analysis with Pylint
JavaScript linter ESLint

30.03.2025

Scraping infoboxes idea
BeautifulSoup
Pandas alternative Polars
BeautifulSoup creator: Leonard Richardson
Pandas creator: Wes McKinney
Polars creator: Ritchie Vink

16.02.2025

Kimball's Dimensional Data Modelling.
Ralph Kimball, bottom-up methodology to data warehousing
Data Warehousing
Bill Inmon, top-down approach to data warehousing
Data warehouse concepts: Kimball vs. Inmon approach.
A dbt project tells dbt about the context of your project and how to transform your data (build your data sets). It is configured in dbt_project.yml.
https://docs.getdbt.com/reference/source-properties
dbt Source Freshness
freshness: A freshness block is used to define the acceptable amount of time between the most recent record, and now, for a table to be considered "fresh".
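The freshness idea can be sketched in a few lines of Python. This is not dbt's implementation: `freshness_status` and its parameters are hypothetical names, and the two thresholds only mirror the idea of a warning limit and an error limit in a freshness block.

```python
from datetime import datetime, timedelta, timezone


def freshness_status(max_loaded_at, now, warn_after, error_after):
    """Classify a table by the age of its most recent record.

    Hypothetical helper, not part of dbt: it only mirrors the idea of a
    freshness block with a warning and an error threshold.
    """
    age = now - max_loaded_at  # time between the most recent record and now
    if age > error_after:
        return "error"
    if age > warn_after:
        return "warn"
    return "pass"


now = datetime(2025, 2, 16, 12, 0, tzinfo=timezone.utc)
latest_record = datetime(2025, 2, 16, 1, 0, tzinfo=timezone.utc)  # 11 hours old

status = freshness_status(
    latest_record,
    now,
    warn_after=timedelta(hours=6),
    error_after=timedelta(hours=24),
)
print(status)
```

With these made-up thresholds an 11-hour-old record is past the warning limit but not yet past the error limit.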

11.02.2025

The SQL Server CONCAT() function joins two or more strings together.
The SQL Server CAST() function converts a value to a specified data type, e.g. int.
The SQL GROUP BY statement is the workhorse of analytics.
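A small sketch of the three notes above working together. It uses Python's built-in sqlite3 instead of SQL Server, so it is an approximation: SQLite writes concatenation as `||` where SQL Server would use CONCAT(). The `orders` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, city TEXT, amount TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("Ann", "Berlin", "10"), ("Ann", "Berlin", "5"), ("Bob", "Hamburg", "7")],
)

# SQLite spells string concatenation as ||; SQL Server would use
# CONCAT(customer, ' / ', city).
# CAST converts the text column to an integer before summing.
rows = conn.execute(
    """
    SELECT customer || ' / ' || city AS label,
           SUM(CAST(amount AS INTEGER)) AS total
    FROM orders
    GROUP BY customer, city
    ORDER BY label
    """
).fetchall()
print(rows)
```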

09.02.2025

sqlalchemy
pandas-read-csv-low-memory-and-dtype-options
pgadmin
YAML
Microsoft Sentinel

02.02.2025

docker network

01.02.2025

psycopg2
ipython
Jupyter Notebook (Formerly known as the IPython Notebook)

28.01.2025

grep-command
Docker daemon

24.01.2025

podman

20.01.2025

Parquet Format
Apache Parquet
Delta Lake
DeltaMaster

19.01.2025

Conda | environment manager
bpython | interpreter
pgcli | cli for Postgres
pip

15.01.2025

Docker Flag -t, --tag stringArray: name and optionally a tag (format: "name:tag")
Docker Flag -it is short for --interactive + --tty (pseudoterminal)
Docker CLI
Docker cheatsheet

14.01.2025

MINGW
DE 2025 Zoomcamp
WSL2
Pandas

26.11.2024

datatalks course
fivetran
hevodata
Data load tool (dlt)
observation of data assets
ingestion framework

25.11.2024

API Catalog

22.11.2024

puppeteer
selenium
playwright

09.02.2021

dynamic finance models
gcp pub sub code

04.02.2021

Building Batch Data Pipelines on GCP
wording roster: Dienstplan

03.02.2021

DirectRunner: runs pipeline locally
DataflowRunner: runs pipeline in the cloud
Fixme

02.02.2021

Integrated Development Environment
MapReduce in Java

01.02.2021

QL in location and zone
aggregating with GroupByKey or CoGroupByKey
wording aggregate: Summe
python reading
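To make the GroupByKey / CoGroupByKey note concrete, here is a plain-Python emulation of what the two aggregations return. It does not use Apache Beam; `group_by_key` and `co_group_by_key` are hypothetical helper names, and the sample data is invented.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Collect all values that share a key: (k, v) pairs -> {k: [v, ...]}."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def co_group_by_key(tagged_inputs):
    """Join several keyed collections: {tag: pairs} -> {k: {tag: [v, ...]}}."""
    keys = {key for pairs in tagged_inputs.values() for key, _ in pairs}
    result = {key: {tag: [] for tag in tagged_inputs} for key in keys}
    for tag, pairs in tagged_inputs.items():
        for key, value in pairs:
            result[key][tag].append(value)
    return result


emails = [("amy", "amy@example.com"), ("carl", "carl@example.com")]
phones = [("amy", "111-222"), ("amy", "333-444"), ("james", "555-666")]

print(group_by_key(phones))
print(co_group_by_key({"emails": emails, "phones": phones}))
```

Keys that appear in only one input still show up in the co-grouped result, with an empty list for the other input.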

31.01.2021

Java neo4j demo
Compare Java vs. NodeJS

30.01.2021

29.01.2021

cloud composer architecture
GA: Google Analytics
GA: general availability
Apache Airflow
ETL workflow patterns: push (event-triggered) or pull (scheduled workflow run)
PTransform: parallel transform
PCollection: parallel collection

28.01.2021

DAG: directed acyclic graph
Wrangler

27.01.2021

miro whiteboard
wording Wrangler: Viehtreiber, Streithahn
dataflow
github actions

26.01.2021

external client certification

25.01.2021

RPA - robotic process automation
Batch Programming
Data Processing
PCollection

24.01.2021

juniper networks
Anaconda

23.01.2021

wording truncated: gekürzt
YARN: Yet Another Resource Negotiator (AKA MapReduce 2.0)
systemd
launchd

22.01.2021

Deeper into the Elephant
wording mitigate: mildern
wording federated query: Verbundabfrage
ISO 8601
Yarn package manager
YAML Ain’t Markup Language
Zeppelin notebook

21.01.2021

read–eval–print loop (REPL): interactive toplevel or language shell
ETL: Dataflow (BigQuery), Dataproc (Spark), Data Fusion (GUI)
In declarative programming, you tell the system what you want and it figures out how to actually do the implementation.
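A minimal sketch of the declarative note, assuming SQL as the declarative language and an in-memory SQLite database as "the system". The numbers and the `numbers` table are invented for illustration.

```python
import sqlite3

values = [5, 12, 7, 20, 3]

# Imperative: spell out how to walk the data and build the result.
imperative = []
for v in values:
    if v > 6:
        imperative.append(v)
imperative.sort()

# Declarative: state what is wanted and let the SQL engine plan the work.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (n INTEGER)")
conn.executemany("INSERT INTO numbers VALUES (?)", [(v,) for v in values])
declarative = [
    row[0] for row in conn.execute("SELECT n FROM numbers WHERE n > 6 ORDER BY n")
]

print(imperative, declarative)
```

Both variants give the same list; only the second leaves the "how" to the engine.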

20.01.2021

docker image with multiple tags, built via gcloud builds submit, can be pushed with the --all-tags option (needs a Docker client that supports it, 20.10 or newer)
docker build -t gcr.io/${_GCP_PROJECT}/$app:${SHORT_SHA} -t gcr.io/${_GCP_PROJECT}/$app:latest .
docker push gcr.io/${_GCP_PROJECT}/$app --all-tags
result: creates tags SHORT_SHA (example: 428a687) and latest

19.01.2021

wording thoroughly: gründlich
data shape: Dataform
data skew: Datenschiefe
data skew can be positive or negative
example: data can follow Zipfian, Gaussian, or Poisson distributions
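Positive versus negative skew can be checked numerically. The sketch below assumes the statistical meaning of skew and uses the Fisher-Pearson moment coefficient (population form); the sample lists are invented.

```python
def skewness(values):
    """Fisher-Pearson moment coefficient of skewness (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


right_tailed = [1, 1, 1, 2, 2, 3, 10]     # a few large values pull the tail right
left_tailed = [-x for x in right_tailed]  # mirrored, so the tail points left
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tailed), skewness(left_tailed), skewness(symmetric))
```

A right tail gives a positive value, a left tail a negative one, and symmetric data gives zero.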

18.01.2021

rowspan
colspan
bgcolor obsolete in HTML5

17.01.2021

BigQuery partitioning on date
use the UNNEST operator to get at array fields
satoshi: smallest bitcoin unit
in bq the comma (,) is a CROSS JOIN and allows querying structs
slot time: compute power distributed in parallel behind the scenes; bytes shuffled: those VMs behind the scenes talking to each other
Dremel: massively parallel SQL engine that powers BigQuery.
BigQuery uses column-based storage, not record-based
struct is a logical container and can be used instead of a table
BigQuery struct: pre-joined table within a table
Schemas
Schema design: Nested and Repeated Fields
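The UNNEST and comma-join notes can be pictured with a plain-Python emulation. This does not call BigQuery; `cross_join_unnest` is a hypothetical helper and the `orders` rows are invented. It roughly corresponds to a query of the form SELECT order_id, item.sku, item.qty FROM orders, UNNEST(items) AS item.

```python
# One row per order; "items" is a repeated field holding structs (dicts here).
orders = [
    {"order_id": 1, "items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}]},
    {"order_id": 2, "items": [{"sku": "A", "qty": 5}]},
    {"order_id": 3, "items": []},
]


def cross_join_unnest(rows, array_field):
    """Pair every parent row with each element of its own array field.

    Rows with an empty array produce no output, like a comma join with UNNEST.
    """
    flattened = []
    for row in rows:
        parent = {k: v for k, v in row.items() if k != array_field}
        for element in row[array_field]:
            flattened.append({**parent, **element})
    return flattened


flat = cross_join_unnest(orders, "items")
for record in flat:
    print(record)
```

The nested rows behave like a pre-joined table: flattening needs no lookup against a second table.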

16.01.2021

inline HTML

15.01.2021

ERP Visual Studio

14.01.2021

margin, padding, border
td: table data
tr: table row

13.01.2021

Dining philosophers problem

12.01.2021

Schemas

11.01.2021

data exfiltration constraints

10.01.2021

online transaction processing: OLTP
online analytical processing: OLAP

09.01.2021

non-breaking space: &nbsp;
CI/CD: continuous integration and continuous delivery or continuous deployment
k8s: Kubernetes, an open-source container-orchestration system
HTTP-over-QUIC protocol: HTTP/3, new HTTP standard

08.01.2021

DISM - a command-line tool that can be used to service and prepare Windows images
BigQuery correlation between 2 datasets from different locations.

07.01.2021

clutch replacement

06.01.2021

Datalake vs. DataWarehouse
clutch replacement

05.01.2021

LuaRocks - the lua package manager
Corona - 2d game engine

04.01.2021

ingest data from google sheet into bq as external table
nice: in bq query editor go into the schema and click the columns
nice: format query in bq query editor
RawData -replicate-> Data Lake -ETLpipeline-> Data Warehouse
GitHub: 816 repositories in the Google Cloud Platform account

03.01.2021

Data Lake vs. Data Warehouse slides

02.01.2021

vocabulary conclude: infer
Keras: the Python deep learning API
NumPy: a high-performance scientific computing package
SciPy: an open source package for science and engineering that uses NumPy
pandas: an open source package for working with tabular data
scikit-learn: an open source machine learning package
TensorFlow: an open source machine learning package for deep learning
Bitbucket from Atlassian
NYC OpenData

01.01.2021

Distsys: Distributed Systems
Distsys law #1: you can pick "at least once" or "at most once", never "exactly once".
Most queue systems default to "at least once".
DeDupe: Data deduplication
expressjs: fast, unopinionated, minimalist web framework for Node.js
npm: Node Package Manager
Cisco Spark = Cisco Webex Teams
AutoML APIs for unstructured data
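The "at least once" and DeDupe notes fit together: if the queue may redeliver, the consumer deduplicates. A minimal sketch, assuming every message carries a unique id. `DedupingConsumer` and the message shape are hypothetical, and the seen-id set lives in memory only.

```python
class DedupingConsumer:
    """Apply each message once even if the queue delivers it several times."""

    def __init__(self):
        self.seen_ids = set()  # in a real system this would be durable storage
        self.balance = 0

    def handle(self, message):
        if message["id"] in self.seen_ids:
            return False  # duplicate delivery: skip the side effect
        self.seen_ids.add(message["id"])
        self.balance += message["amount"]
        return True


# At-least-once delivery: message "m1" arrives twice after a simulated retry.
deliveries = [
    {"id": "m1", "amount": 10},
    {"id": "m2", "amount": 5},
    {"id": "m1", "amount": 10},
]

consumer = DedupingConsumer()
applied = [consumer.handle(m) for m in deliveries]
print(applied, consumer.balance)
```

The delivery is still "at least once"; only the effect becomes once per message id.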

31.12.2020

planning clutch replacement 9N3

30.12.2020

GCP cold start prevention
GCP Cloud Functions
Logs missing in Stackdriver during high loads; use Grafana Promtail or Grafana Loki instead of GKE logging
Graceful shutdowns on Cloud Run
data engineering DE jump point implementation

29.12.2020

Pub/Sub pipeline

28.12.2020

BigQuery struct
A view is a stored query.
longitude latitude
geoViz for google GIS (Geographic information system)
experimentation process: "hyperparameter tuning"
predict

27.12.2020

Emmet Sublime HTML plugin
Grunt command line tool
mailgun email API

26.12.2020

Qwiklabs Configure the load balancing service
HTTP load balancer / Network load balancer
HTTP(S) Load Balancing is implemented on Google Front End (GFE)
spokuj gets a commands collection
Qwiklabs: achieved first Google Cloud Essentials badge

25.12.2020

diary correction

24.12.2020

NVM Node version manager
BigQuery SQL dialects
Hadoop on-premise thing in scaling
Preemptible VMs are up to 80% cheaper and useful for fault-tolerant batch jobs
faulty recommendations ML using Cloud SQL and Spark

23.12.2020

Qwiklabs: creating a cluster in the lab "Kubernetes Engine: Qwik Start"

22.12.2020

Qwiklabs Creating a Virtual Machine
Github repo email reply

21.12.2020

Qwiklabs Creating a Virtual Machine
pfile: Azure Information Protection viewer

19.12.2020

coursera Data engineer
news: Rentenportal (pension portal)

18.12.2020

HTML clean code

17.12.2020

coursera Access

09.12.2020

latin: Sapere aude / Dare to know
GCP QwikLabs

08.12.2020

GCP QwikLabs

07.12.2020

GCP Data Pipelines

06.12.2020

St. Nicholas gift: honey
spokuj anniversary

05.12.2020

wording: Retrospektive (retrospective) / "Verteilter Rückblick" (distributed retrospective), iX "09/20"
Tool: Parabol
JSON Testing at test.html

04.12.2020

wording: Feedback possibility

03.12.2020

App: Black Duck