--- title: "statar" author: "Matthieu Gomez" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Panel data} %\VignetteEngine{knitr::rmarkdown} %\usepackage[utf8]{inputenc} --- ### Elapsed dates The classes "monthly" and "quarterly" print as dates and are compatible with usual time extraction (ie `month`, `year`, etc). Yet, they are stored as integers representing the number of elapsed periods since 1970/01/0 (resp in week, months, quarters). This is particularly handy for simple algebra: ```R # elapsed dates library(lubridate) date <- mdy(c("04/03/1992", "01/04/1992", "03/15/1992")) datem <- as.monthly(date) # displays as a period datem #> [1] "1992m04" "1992m01" "1992m03" # behaves as an integer for numerical operations: datem + 1 #> [1] "1992m05" "1992m02" "1992m04" # behaves as a date for period extractions: year(datem) #> [1] 1992 1992 1992 ``` ### lag / lead `tlag`/`tlead` a vector with respect to a number of periods, **not** with respect to the number of rows ```R year <- c(1989, 1991, 1992) value <- c(4.1, 4.5, 3.3) tlag(value, 1, time = year) library(lubridate) date <- mdy(c("01/04/1992", "03/15/1992", "04/03/1992")) datem <- as.monthly(date) value <- c(4.1, 4.5, 3.3) tlag(value, time = datem) ``` In constrast to comparable functions in `zoo` and `xts`, these functions can be applied to any vector and be used within a `dplyr` chain: ```R df <- tibble( id = c(1, 1, 1, 2, 2), year = c(1989, 1991, 1992, 1991, 1992), value = c(4.1, 4.5, 3.3, 3.2, 5.2) ) df %>% group_by(id) %>% mutate(value_l = tlag(value, time = year)) ``` ### is.panel `is.panel` checks whether a dataset is a panel i.e. the time variable is never missing and the combinations (id, time) are unique. ```R df <- tibble( id1 = c(1, 1, 1, 2, 2), id2 = 1:5, year = c(1991, 1993, NA, 1992, 1992), value = c(4.1, 4.5, 3.3, 3.2, 5.2) ) df %>% group_by(id1) %>% is.panel(year) df1 <- df %>% filter(!is.na(year)) df1 %>% is.panel(year) df1 %>% group_by(id1) %>% is.panel(year) df1 %>% group_by(id1, id2) %>% is.panel(year) ``` ### fill_gap fill_gap transforms a unbalanced panel into a balanced panel. It corresponds to the stata command `tsfill`. Missing observations are added as rows with missing values. ```R df <- tibble( id = c(1, 1, 1, 2), datem = as.monthly(mdy(c("04/03/1992", "01/04/1992", "03/15/1992", "05/11/1992"))), value = c(4.1, 4.5, 3.3, 3.2) ) df %>% group_by(id) %>% fill_gap(datem) df %>% group_by(id) %>% fill_gap(datem, full = TRUE) df %>% group_by(id) %>% fill_gap(datem, roll = "nearest") ```