Demographic Characterization of Anonymous Trace Travel Data

Author(s):  
Joshua Auld ◽  
Abolfazl (Kouros) Mohammadian ◽  
Marcelo Simas Oliveira ◽  
Jean Wolf ◽  
William Bachman

Research was undertaken to determine whether demographic characteristics of individual travelers could be derived from travel pattern information when no information about the individual was available. This question is relevant in the context of anonymously collected travel information, such as cell phone traces, when used for travel demand modeling. Determining the demographics of a traveler from such data could partially obviate the need for large-scale collection of travel survey data, depending on the purpose for which the data were to be used. This research complements methodologies used to identify activity stops, purposes, and mode types from raw trace data and presumes that such methods exist and are available. The paper documents the development of procedures for taking raw activity streams estimated from GPS trace data and converting these into activity travel pattern characteristics that are then combined with basic land use information and used to estimate various models of demographic characteristics. The work status, education level, age, and license possession of individuals and the presence of children in their households were all estimated successfully with substantial increases in performance versus null model expectations for both training and test data sets. The gender, household size, and number of vehicles proved more difficult to estimate, and performance was lower on the test data set; these aspects indicate overfitting in these models. Overall, the demographic models appear to have potential for characterizing anonymous data streams, which could extend the usability and applicability of such data sources to the travel demand context.

Author(s):  
Ryosuke Abe ◽  
Kay W. Axhausen

This study estimates the impact of major road supply on individual travel time expenditures (TTEs) using data that cover 30-year variations in transportation infrastructure and travel behavior. The impacts of the supply of road and rail infrastructure are estimated with a data set that combines records of large-scale household travel surveys in the Tokyo metropolitan area conducted in 1978, 1988, 1998, and 2008. Linear and Tobit models of individual TTEs are estimated by following the behavior of birth cohorts over the 30-year period. The models incorporate the changes in transportation infrastructure, measured as lane kilometers of two levels of major road stock and vehicle kilometers of urban rail service. The results show significant negative effects of lane kilometers for higher-level and lower-level major roads on the TTEs for all travel purposes and for commuting, after controlling for socioeconomic backgrounds and generations of individuals. The study argues that, in Tokyo, the estimated effect is more likely to reflect the effect of a major road network per se on individual TTEs than the (indirect) effect of major road supply on individual TTEs working through land development activities (i.e., induced car travel demand). One caveat is that actual road investment decisions still need to consider the induced component of road traffic in addition to the (direct) effect that is estimated in this study.


Data in Brief ◽  
2018 ◽  
Vol 17 ◽  
pp. 267-274 ◽  
Author(s):  
Matej Cebecauer ◽  
Ľuboš Buzna

2019 ◽  
Vol 8 (6) ◽  
pp. 257 ◽  
Author(s):  
Huihui Wang ◽  
Hong Huang ◽  
Xiaoyong Ni ◽  
Weihua Zeng

Mobility and spatial interaction data have become increasingly available due to the widespread adoption of location-aware technologies. Examples of mobile data include human daily activities, vehicle trajectories, and animal movements. In this study we focus on a special type of mobility data, i.e., origin–destination (OD) pairs, and propose a new adapted chord diagram plot to reveal the urban human travel spatial-temporal characteristics and patterns of a seven-day taxi trajectory data set collected in Beijing; this large-scale data set includes approximately 88.5 million trips of anonymous customers. The spatial distribution patterns of the pick-up points (PUPs) and the drop-off points (DOPs) on weekdays and weekends are analyzed first. The morning and evening peaks occur at 8:00–10:00 and 17:00–19:00. The morning peaks of taxis are delayed by 0.5–1 h compared with the commuting morning peaks. Second, travel demand, intensity, time, and distance on weekdays and weekends are analyzed to explore human mobility. The travel demand and high-intensity travel of residents in Beijing are mainly concentrated within the 6th Ring Road. The residents who travel long distances (>10 km) and for a long time (>60 min) are mainly from outside the 6th Ring Road and the surrounding new towns of Beijing. The circular structure of the travel distance distribution also confirms the single-center urban structure of Beijing. Finally, a new adapted chord diagram plot is proposed to achieve the spatial-temporal scale visualization of taxi trajectory origin–destination (OD) flows. The method can characterize the volume, direction, and properties of OD flows in multiple spatial-temporal scales; it is implemented using a circular visualization package in R (circlize). Through the visualization experiment of taxi GPS trajectory data in Beijing, the results show that the proposed visualization technology is able to characterize the spatial-temporal patterns of trajectory OD flows in multiple spatial-temporal scales. These results are expected to enhance current urban mobility research and suggest some interesting avenues for future research.
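
As a rough illustration of the OD aggregation step that a chord diagram of this kind is drawn from, the following minimal Python sketch counts trips between zones and arranges them as a square flow matrix. It is not the authors' implementation (the abstract states that theirs uses circlize in R); the function name `od_flow_matrix`, the zone labels, and the toy trips are all hypothetical.

```python
from collections import Counter


def od_flow_matrix(trips, zones):
    """Count trips per (origin, destination) pair and return a square matrix.

    Rows are origins and columns are destinations, both in the order of `zones`.
    """
    counts = Counter(trips)  # missing pairs count as 0
    return [[counts[(origin, dest)] for dest in zones] for origin in zones]


# Hypothetical toy data: each trip is an (origin zone, destination zone) pair.
zones = ["A", "B", "C"]
trips = [("A", "B"), ("A", "B"), ("B", "A"), ("C", "A"), ("A", "C")]

matrix = od_flow_matrix(trips, zones)
for origin, row in zip(zones, matrix):
    print(origin, row)
```

Each row of the matrix holds the outgoing flows of one zone, so flow volume and direction can both be read from it; splitting the trips by hour or by weekday before counting would give one such matrix per temporal scale.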


2009 ◽  
Vol 28 (11) ◽  
pp. 2737-2740
Author(s):  
Xiao ZHANG ◽  
Shan WANG ◽  
Na LIAN

Author(s):  
Eun-Young Mun ◽  
Anne E. Ray

Integrative data analysis (IDA) is a promising new approach in psychological research and has been well received in the field of alcohol research. This chapter provides a larger unifying research synthesis framework for IDA. Major advantages of IDA of individual participant-level data include better and more flexible ways to examine subgroups, model complex relationships, deal with methodological and clinical heterogeneity, and examine infrequently occurring behaviors. However, between-study heterogeneity in measures, designs, and samples and systematic study-level missing data are significant barriers to IDA and, more broadly, to large-scale research synthesis. Based on the authors’ experience working on the Project INTEGRATE data set, which combined individual participant-level data from 24 independent college brief alcohol intervention studies, it is also recognized that IDA investigations require a wide range of expertise and considerable resources and that some minimum standards for reporting IDA studies may be needed to improve transparency and quality of evidence.


2020 ◽  
Vol 47 (3) ◽  
pp. 547-560 ◽  
Author(s):  
Darush Yazdanfar ◽  
Peter Öhman

Purpose: The purpose of this study is to empirically investigate determinants of financial distress among small and medium-sized enterprises (SMEs) during the global financial crisis and post-crisis periods. Design/methodology/approach: Several statistical methods, including multiple binary logistic regression, were used to analyse a longitudinal cross-sectional panel data set of 3,865 Swedish SMEs operating in five industries over the 2008–2015 period. Findings: The results suggest that financial distress is influenced by macroeconomic conditions (i.e. the global financial crisis) and, in particular, by various firm-specific characteristics (i.e. performance, financial leverage and financial distress in previous year). However, firm size and industry affiliation have no significant relationship with financial distress. Research limitations: Due to data availability, this study is limited to a sample of Swedish SMEs in five industries covering eight years. Further research could examine the generalizability of these findings by investigating other firms operating in other industries and other countries. Originality/value: This study is the first to examine determinants of financial distress among SMEs operating in Sweden using data from a large-scale longitudinal cross-sectional database.


2003 ◽  
Vol 42 (05) ◽  
pp. 564-571 ◽  
Author(s):  
M. Schumacher ◽  
E. Graf ◽  
T. Gerds

Objectives: A lack of generally applicable tools for the assessment of predictions for survival data has to be recognized. Prediction error curves based on the Brier score that have been suggested as a sensible approach are illustrated by means of a case study. Methods: The concept of predictions made in terms of conditional survival probabilities given the patient’s covariates is introduced. Such predictions are derived from various statistical models for survival data including artificial neural networks. The idea of how the prediction error of a prognostic classification scheme can be followed over time is illustrated with the data of two studies on the prognosis of node-positive breast cancer patients, one of them serving as an independent test data set. Results and Conclusions: The Brier score as a function of time is shown to be a valuable tool for assessing the predictive performance of prognostic classification schemes for survival data incorporating censored observations. Comparison with the prediction based on the pooled Kaplan-Meier estimator yields a benchmark value for any classification scheme incorporating patient’s covariate measurements. The problem of an overoptimistic assessment of prediction error caused by data-driven modelling as it is, for example, done with artificial neural nets can be circumvented by an assessment in an independent test data set.
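
To make the idea of a Brier score at a fixed time point concrete, here is a minimal Python sketch under a strong simplifying assumption: every event time is fully observed, so there is no censoring. The abstract's method handles censored observations, which requires additional weighting that is deliberately left out here. The function name `brier_score_at` and the toy data are hypothetical.

```python
def brier_score_at(t, event_times, predicted_survival):
    """Brier score at time t, assuming NO censoring (simplification).

    Averages the squared gap between the observed status 1{T_i > t}
    and the predicted survival probability S(t | x_i) for each patient.
    """
    if len(event_times) != len(predicted_survival):
        raise ValueError("inputs must have the same length")
    total = 0.0
    for time, s_hat in zip(event_times, predicted_survival):
        observed = 1.0 if time > t else 0.0  # 1 if still event-free at t
        total += (observed - s_hat) ** 2
    return total / len(event_times)


# Hypothetical toy data: four patients with fully observed event times.
event_times = [2.0, 5.0, 7.0, 10.0]

uninformative = brier_score_at(6.0, event_times, [0.5, 0.5, 0.5, 0.5])
perfect = brier_score_at(6.0, event_times, [0.0, 0.0, 1.0, 1.0])
print(uninformative, perfect)
```

Evaluating the function over a grid of values of t traces out a prediction error curve; lower values indicate better predictions, and a constant prediction of 0.5 always scores 0.25 in this uncensored setting.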


2020 ◽  
Vol 72 (1) ◽  
Author(s):  
Chao Xiong ◽  
Claudia Stolle ◽  
Patrick Alken ◽  
Jan Rauberg

In this study, we have derived field-aligned currents (FACs) from magnetometers onboard the Defense Meteorological Satellite Project (DMSP) satellites. The magnetic latitude versus local time distribution of FACs from DMSP shows comparable dependences with previous findings on the intensity and orientation of interplanetary magnetic field (IMF) By and Bz components, which confirms the reliability of the DMSP FAC data set. With simultaneous measurements of precipitating particles from DMSP, we further investigate the relation between large-scale FACs and precipitating particles. Our result shows that precipitation electron and ion fluxes both increase in magnitude and extend to lower latitude for enhanced southward IMF Bz, which is similar to the behavior of FACs. Under weak northward and southward Bz conditions, the locations of the R2 current maxima, at both dusk and dawn sides and in both hemispheres, are found to be close to the maxima of the particle energy fluxes; while for the same IMF conditions, R1 currents are displaced further to the respective particle flux peaks. The largest displacement (about 3.5°) is found between the downward R1 current and ion flux peak at the dawn side. Our results suggest that there exist systematic differences in locations of electron/ion precipitation and large-scale upward/downward FACs. As outlined by the statistical mean of these two parameters, the FAC peaks enclose the particle energy flux peaks in an auroral band at both dusk and dawn sides. Our comparisons also found that particle precipitation at dawn and dusk and in both hemispheres maximizes near the mean R2 current peaks. The particle precipitation flux maxima closer to the R1 current peaks are lower in magnitude. This is opposite to the known feature that R1 currents are on average stronger than R2 currents.


Author(s):  
Lior Shamir

Several recent observations using large data sets of galaxies showed non-random distribution of the spin directions of spiral galaxies, even when the galaxies are too far from each other to have gravitational interaction. Here, a data set of $\sim8.7\cdot10^3$ spiral galaxies imaged by Hubble Space Telescope (HST) is used to test and profile a possible asymmetry between galaxy spin directions. The asymmetry between galaxies with opposite spin directions is compared to the asymmetry of galaxies from the Sloan Digital Sky Survey. The two data sets contain different galaxies at different redshift ranges, and each data set was annotated using a different annotation method. The results show that both data sets show a similar asymmetry in the COSMOS field, which is covered by both telescopes. Fitting the asymmetry of the galaxies to cosine dependence shows a dipole axis with probabilities of $\sim2.8\sigma$ and $\sim7.38\sigma$ in HST and SDSS, respectively. The most likely dipole axis identified in the HST galaxies is at $(\alpha=78^\circ,\delta=47^\circ)$ and is well within the $1\sigma$ error range compared to the location of the most likely dipole axis in the SDSS galaxies with $z>0.15$, identified at $(\alpha=71^\circ,\delta=61^\circ)$.


Author(s):  
Usman Naseem ◽  
Imran Razzak ◽  
Matloob Khushi ◽  
Peter W. Eklund ◽  
Jinman Kim
