Say we're working with clickstream data:
CREATE TABLE pageviews (
"timestamp" timestamp,
url text,
referrer text,
user_agent text,

0-mail.com
007addict.com
020.co.uk
027168.com
0815.ru
0815.su
0clickemail.com
0sg.net
0wnd.net
0wnd.org

library(tidyverse)
# data from https://www.kaggle.com/bls/american-time-use-survey
df.resp <- read_csv('../data/atus/atusresp.csv')
df.act <- read_csv('../data/atus/atusact.csv', col_types=cols(tustarttim = col_character(), tustoptime = col_character()))
df.sum <- read_csv('../data/atus/atussum.csv')
df.tmp <- df.act %>%
  mutate(activity = case_when(trtier2p == 1301 ~ 'Exercise',

| PostgreSQL Data Types | AWS DMS Data Types | Redshift Data Types |
|---|---|---|
| INTEGER | INT4 | INT4 |
| SMALLINT | INT2 | INT2 |
| BIGINT | INT8 | INT8 |
| NUMERIC (p,s) | If precision is 39 or greater, then use STRING. | If the scale is >= 0 and <= 37, then NUMERIC (p,s). If the scale is >= 38 and <= 127, then VARCHAR (Length). |
| DECIMAL (p,s) | If precision is 39 or greater, then use STRING. | If the scale is >= 0 and <= 37, then NUMERIC (p,s). If the scale is >= 38 and <= 127, then VARCHAR (Length). |
| REAL | REAL4 | FLOAT4 |
| DOUBLE | REAL8 | FLOAT8 |
| SMALLSERIAL | INT2 | INT2 |
| SERIAL | INT4 | INT4 |

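
The Redshift column of the table can be read as a small lookup plus one rule for `NUMERIC`/`DECIMAL`. The sketch below is only an illustration of that reading: the helper name `redshift_type` and its signature are hypothetical, it is not part of AWS DMS, and it covers only the rows shown above. The table does not give a concrete `VARCHAR` length, so the sketch leaves `Length` symbolic.

```python
# One-to-one rows of the table: PostgreSQL type -> Redshift type.
FIXED_TYPES = {
    "INTEGER": "INT4",
    "SMALLINT": "INT2",
    "BIGINT": "INT8",
    "REAL": "FLOAT4",
    "DOUBLE": "FLOAT8",
    "SMALLSERIAL": "INT2",
    "SERIAL": "INT4",
}


def redshift_type(pg_type, precision=None, scale=None):
    """Return the Redshift type the table lists for a PostgreSQL type.

    Hypothetical helper for illustration; not an AWS DMS API.
    """
    name = pg_type.upper()
    if name in FIXED_TYPES:
        return FIXED_TYPES[name]
    if name in ("NUMERIC", "DECIMAL"):
        if precision is None or scale is None:
            raise ValueError("NUMERIC/DECIMAL needs both precision and scale")
        if 0 <= scale <= 37:
            return f"NUMERIC({precision},{scale})"
        if 38 <= scale <= 127:
            # The table names no concrete length, so it stays symbolic.
            return "VARCHAR(Length)"
        raise ValueError("scale is outside the 0-127 range the table covers")
    raise KeyError(f"{pg_type!r} is not one of the rows in the table")


if __name__ == "__main__":
    print(redshift_type("integer"))
    print(redshift_type("NUMERIC", precision=10, scale=2))
    print(redshift_type("DECIMAL", precision=50, scale=40))
```

Types outside the table raise an error rather than guessing a mapping, which keeps the sketch from implying rows the table does not contain.
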
It's now here, in The Programmer's Compendium. The content is the same as before, but being part of the compendium means that it's actively maintained.