nowcasting_benchmark

This repository is an accompaniment to an article (available here or here) benchmarking common nowcasting and machine learning methodologies. It illustrates how to estimate each of the methods examined in the analysis in either R or Python. 17 methodologies were tested in nowcasting quarterly US GDP using data from Federal Reserve Economic Data (FRED). The variables chosen were those specified in Bok et al. (2018). The methodologies were tested on a period dating from Q1 2002 to Q3 2022.

In applied nowcasting exercises, ideally several methodologies should be employed and their results compared empirically for final model selection. In practice, this is difficult due to the fragmented landscape of different nowcasting methodology frameworks and implementations. This repository aims to make things significantly easier by providing fully runnable boilerplate code in R or Python for each methodology examined in the benchmarking analysis. The methodologies/ directory contains self-contained Jupyter notebooks illustrating how each methodology can be run in the nowcasting context, with an example using data from FRED and a testing period from 2005 to 2010. This shorter window was chosen to reduce runtime for illustration; users can select their own testing periods if desired.

All the notebooks assume initial input data is in the format of seasonally adjusted growth rates in the highest frequency of the data (monthly in this case), with a date column at the beginning and a separate column for each variable. Lower-frequency data should have their values listed in the final month of that period (e.g. December for yearly data; March, June, September, or December for quarterly data), with NAs in the intervening periods and for missing data at the end of series. An example is below.
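A minimal sketch of this format in pandas, with hypothetical column names and frequency codes (the exact metadata conventions follow the notebooks, not this snippet):

```python
import numpy as np
import pandas as pd

# Illustrative input format: monthly rows, a date column first, then one
# column per variable. The quarterly target ("gdp" here) only has values
# in the final month of each quarter; all other months are NA.
data = pd.DataFrame({
    "date": pd.date_range("2002-01-01", periods=6, freq="MS"),
    "ind_production": [0.2, -0.1, 0.4, 0.3, 0.1, -0.2],  # monthly series
    "gdp": [np.nan, np.nan, 0.8, np.nan, np.nan, 0.6],   # quarterly series
})

# accompanying metadata: one row per series, giving its frequency
metadata = pd.DataFrame({
    "series": ["ind_production", "gdp"],
    "frequency": ["m", "q"],
})
```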

Also necessary is a metadata CSV listing the name of each series/column and its frequency. Once these two conditions are met, it should be possible to run any of the methodologies on your own dataset, with adjustments as needed using the notebooks as a guide. Below is a short overview of each of the methodologies, followed by graphical results of the full benchmarking analysis showing predictions on different data vintages.

Recommendation for a single methodology

If you only have bandwidth or interest to try out one methodology, the LSTM is recommended. It is accessible, available in four different programming languages (both R and Python notebooks are included in the methodologies/ directory), and straightforward to estimate and generate predictions from given the data format stipulated above. It has shown strong predictive performance relative to the other methodologies, including during shock conditions, and, unlike some of the other methodologies, will not fail to estimate on certain datasets. It does, however, have hyperparameters that may need to be tuned if initial performance is not good. This can be done from within the nowcast_lstm library via the hyperparameter_tuning function. It is also the only implementation with built-in functionality for variable selection. In this analysis, input variables were taken as given, but often this is not the case; the variable_selection function will select the best-performing variables from a pool of candidate variables. Hyperparameter tuning and variable selection can also be performed together with the select_model function. See the documentation of the nowcast_lstm library for more information.

Methodologies

  • ARMA (model_arma.ipynb):
    • background: Wikipedia
    • language, library, and function: Python, ARIMA function of the pmdarima library
    • commentary: Univariate benchmark model. Acceptable performance in normal/non-volatile times, extremely limited use during any shock periods. Potential use as another way to fill "ragged-edge" missing data for component series in other methodologies, as opposed to mean-filling.
  • Bayesian mixed-frequency vector autoregression (model_bvar.ipynb):
    • background: Wikipedia, ECB working paper
    • language, library, and function: R, estimate_mfbvar function of mfbvar library
    • commentary: Difficult to get data into the proper format for the function to estimate properly, making dynamic/programmatic changing and selection of variables, and overall usage, hard but doable. Very performant methodology in this benchmarking analysis, ranking second-best in terms of RMSE and best in terms of MAE. However, predictions were very volatile, with the highest month-to-month revisions in predictions on average. It may also produce occasional large outlier predictions or fail to estimate on a dataset due to convergence or other issues. This library/implementation cannot handle yearly variables.
  • Decision tree (model_dt.ipynb):
    • background: Wikipedia
    • language, library, and function: Python, DecisionTreeRegressor function of the sklearn library
    • commentary: Simple methodology, not traditionally used in nowcasting. Doesn't handle time series natively; this is addressed by including additional variables for lags. Poor performance in this benchmarking analysis. The tree-based methodologies in this analysis (decision trees, random forest, gradient boosted trees, and XGBoost) learn most of their information from the latest available data, so they have difficulty predicting anything other than the mean in early data vintages. See model_gb.ipynb for a means of addressing this. They also have difficulty predicting values more extreme than any they have seen before, limiting their use in shock periods, e.g. during the COVID crisis. Has hyperparameters which may need to be tuned.
  • DeepVAR (model_deepvar.ipynb):
    • background: Working paper
    • language, library, and function: Python, DeepVAREstimator function of GluonTS library
    • commentary: Originally developed as a forecasting tool with the business context in mind. Generates probabilistic forecasts via an autoregressive recurrent neural network (RNN). Its performance in this analysis was poor, akin to that of an improved ARMA model. Depending on hyperparameters, it is slow to estimate compared with the other methodologies in this analysis. Its syntax is also complicated to get working for the nowcasting context.
  • Dynamic factor model (DFM) (model_dfm.ipynb):
    • background: Wikipedia, FRB NY paper, UNCTAD research paper
    • language, library, and function: R, dfm function of nowcastDFM library
    • commentary: De facto standard in nowcasting, very commonly used. Middling performance in this analysis. May require assigning variables to different "blocks" or groups, which can be an added complication. In this benchmarking analysis, the DFM without blocks (equivalent to one "global" block/factor) performed worse than the model with the blocks specified by the Fed. The model also fails to estimate on many datasets due to uninvertible matrices. Estimation may take a long time depending on convergence of the expectation-maximization algorithm, and estimating models with more than 20 variables can be very slow. This library/implementation cannot handle yearly variables.
  • Elastic net (model_elasticnet.ipynb):
    • background: Wikipedia
    • language, library, and function: Python, ElasticNet function of sklearn library
    • commentary: OLS with the introduction of L1 and L2 regularization penalty terms. Can potentially help with the multicollinearity issues of OLS in the nowcasting context. Performance is expectedly better than that of OLS in this benchmarking analysis, with less volatile predictions. Overall it was the best performer amongst the methodologies that do not natively handle time series. Introduces the alpha and L1-ratio hyperparameters, which need to be tuned.
  • Gradient boosted trees (model_gb.ipynb):
    • background: Wikipedia
    • language, library, and function: Python, GradientBoostingRegressor function of the sklearn library
    • commentary: Very performant model in traditional machine learning applications. Doesn't handle time series natively; this is addressed by including additional variables for lags. Poor performance in this benchmarking analysis. However, performance can be substantially improved by training separate models for different data vintages; details are in the model_gb.ipynb example file. This approach can be applied to any of the methodologies that don't handle time series (OLS, random forest, etc.), but it had the biggest positive impact in this benchmarking analysis for gradient boosted trees. Has hyperparameters which may need to be tuned.
  • Lasso (model_lasso.ipynb):
    • background: Wikipedia
    • language, library, and function: Python, Lasso function of sklearn library
    • commentary: OLS with introduction of L1 regularization penalty term. Can potentially help with multicollinearity issues of OLS in the nowcasting context. Performance is expectedly better than that of OLS in this benchmarking analysis, with less volatile predictions. Overall it was better than ridge regression, but worse than elastic net. Introduces the Lasso alpha hyperparameter which needs to be tuned.
  • Long short-term memory neural network (LSTM) (model_lstm.ipynb):
    • background: Wikipedia, first article, second article
    • language, library, and function: Python or R, LSTM function of the nowcast_lstm library (implementations also exist in MATLAB and Julia)
    • commentary: Very performant model, best performer in terms of RMSE, second-best in terms of MAE. Able to handle any frequency of data in either target or explanatory variables, easiest data setup process of any implementation in this benchmarking analysis. Couples high predictive performance with relatively low volatility, e.g. in contrast with Bayesian VAR, which also has good predictive performance, but is quite volatile. Can handle an arbitrarily large number of input variables without affecting estimation time and can be estimated on any dataset without error. Has hyperparameters which may need to be tuned.
  • Mixed-frequency vector autoregression (MF-VAR) (model_var.ipynb):
    • background: Wikipedia, Minneapolis Fed paper
    • language, library, and function: Python, VAR function of the PyFlux library
    • commentary: Has been used in nowcasting. Middling performance in this benchmarking analysis. The PyFlux implementation can be difficult to get working and may not run on versions of Python > 3.5.
  • Mixed data sampling regression (MIDAS) (model_midas.ipynb):
    • background: Wikipedia, paper
    • language, library, and function: R, midas_r function of midasr library
    • commentary: Has been used in nowcasting; solid performance in this benchmarking analysis. Difficult data setup process to estimate and generate predictions.
  • Midasml (model_midasml.ipynb):
    • background: Working paper
    • language, library, and function: R, cv.sglfit function of midasml library
    • commentary: Relatively new methodology that builds on MIDAS models by introducing LASSO regularization. Solid performance (third-best) in this analysis. Relatively difficult syntax to get working in the nowcasting context.
  • Multilayer perceptron (feedforward) artificial neural network (model_mlp.ipynb):
    • background: Wikipedia, paper
    • language, library, and function: Python, MLPRegressor function of sklearn library
    • commentary: Has been used in nowcasting; decent performance in this benchmarking analysis. Doesn't handle time series natively; this is addressed by including additional variables for lags. Has hyperparameters which may need to be tuned.
  • Ordinary least squares regression (OLS) (model_ols_ridge.ipynb):
    • background: Wikipedia
    • language, library, and function: Python, LinearRegression function of sklearn library
    • commentary: Extremely popular approach to regression problems. Doesn't handle time series natively; this is addressed by including additional variables for lags. Middling performance in this benchmarking analysis and very volatile; it will also likely suffer from multicollinearity if many variables are included.
  • Random forest (model_rf.ipynb):
    • background: Wikipedia
    • language, library, and function: Python, RandomForestRegressor function of the sklearn library
    • commentary: Popular methodology in classical machine learning, combining the predictions of many random decision trees. Doesn't handle time series natively; this is addressed by including additional variables for lags. Poor performance in this benchmarking analysis. Has hyperparameters which may need to be tuned.
  • Ridge regression (model_ridge.ipynb):
    • background: Wikipedia
    • language, library, and function: Python, Ridge function of the sklearn library
    • commentary: OLS with introduction of L2 regularization penalty term. Can potentially help with multicollinearity issues of OLS in the nowcasting context. Performance is expectedly slightly better than that of OLS in this benchmarking analysis, with less volatile predictions. Introduces the ridge alpha hyperparameter which needs to be tuned.
  • XGBoost (model_xgboost.ipynb):
    • background: Wikipedia
    • language, library, and function: Python, XGBRegressor function of the xgboost library
    • commentary: Similar methodology to gradient boosting, with added regularization and implementation tweaks. Performance is very similar to that of gradient boosted trees.
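Several of the methodologies above (OLS, ridge, lasso, elastic net, decision tree, random forest, gradient boosted trees, XGBoost, MLP) do not handle time series natively; the workaround they share is to add lagged copies of each variable as extra columns. A minimal sketch on hypothetical data, using sklearn's DecisionTreeRegressor and ElasticNet:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import ElasticNet
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
# hypothetical monthly indicator and a target it partially drives
df = pd.DataFrame({"x": rng.normal(size=120)})
df["y"] = 0.5 * df["x"] + 0.3 * df["x"].shift(1).fillna(0.0) + rng.normal(0, 0.1, 120)

# time series handled by adding lagged copies of each variable as columns
for lag in (1, 2, 3):
    df[f"x_lag{lag}"] = df["x"].shift(lag)
df = df.dropna()  # drop the first rows, which have incomplete lags

X, y = df.drop(columns="y"), df["y"]
tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, y)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
```

The same lag-augmented `X` can be fed to any of the non-time-series models listed above; only the estimator class changes.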

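The vintage-specific training trick described for gradient boosted trees (model_gb.ipynb) can be sketched roughly as follows: train one model per data-availability pattern, with not-yet-published columns filled at training time the same way they would be at prediction time. The data and vintage definitions here are hypothetical:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))  # three hypothetical indicators
y = X @ np.array([0.5, 0.3, 0.2]) + rng.normal(0, 0.1, 200)

# each vintage corresponds to a set of already-published indicators;
# unavailable columns are mean-filled (zero here, since data are centered)
vintages = {"early": [0], "mid": [0, 1], "late": [0, 1, 2]}

models = {}
for name, avail in vintages.items():
    X_v = np.zeros_like(X)
    X_v[:, avail] = X[:, avail]  # mask out later-arriving columns
    models[name] = GradientBoostingRegressor(random_state=0).fit(X_v, y)

# at prediction time, use the model matching the current vintage
latest = np.zeros((1, 3))
latest[0, 0] = 0.4               # only the first indicator is published
pred = models["early"].predict(latest)
```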
Scatter plots

These plots show MAE and RMSE plotted against average revision (i.e., volatility). The ideal model would be in the lower left corner (low error and low volatility).

Graphical results

These plots show actuals and nowcasts at different time vintages for each methodology, ordered alphabetically.
