LiveLab traces
We have made available the traces we have collected, subject to the following license agreement. Downloading, obtaining, and/or using the traces by any means constitutes your agreement with these terms.
Copyright and License
- We grant you a nonexclusive, nontransferable license to use the data for commercial, educational, and/or research purposes only. You agree to not redistribute the data without written permission from us.
- The traces from human subjects have been anonymized by us. To respect the privacy of those human subjects whose activity is captured by the data, you will not attempt to reverse the anonymization process. Furthermore, you will not disclose, nor attempt to reveal, any private information of our participants, including but not limited to their identity and location.
- You agree to acknowledge the source of the data, i.e., the LiveLab project, by citing the following paper in your publication or product:
Clayton Shepard, Ahmad Rahmati, Chad Tossell, Lin Zhong, and Phillip Kortum, "LiveLab: measuring wireless networks and smartphone users in the field", in ACM SIGMETRICS Perform. Eval. Rev., vol. 38, no. 3, December 2010.
Where relevant (e.g., web usage), you will also cite the Tempo project:
Zhen Wang, Felix Xiaozhu Lin, Lin Zhong, and Mansoor Chishtie, "How far can client-only solutions go for mobile browser speed?" in Int. World Wide Web Conf. (WWW), April 2012.
- We provide no warranty whatsoever on any aspect of the data, including but not limited to its correctness, completeness, and fitness. Use at your own risk.
- For more information regarding the data and how it is collected, please refer to the papers above. We do not provide any further support regarding this data.
NOTE: Downloading, obtaining, and/or using the traces by any means constitutes your agreement with these terms.
The traces are broken down into a number of .sql files. Each .sql file contains one table, as described below. Each measurement is stored as an entry in its relevant table.
- id: unique ID # (index) for row (internal to LiveLab)
- name: name of application (unique)
- genre: Genre of application, as reported by Apple's App Store
- price: price of application, in thousandths of a dollar.
apps.sql: list of all installed applications, among all users
[download]
- id: unique ID # (index) for row (internal to LiveLab)
- uid: user ID
- name: name of application (unique to application)
- time: time and date (POSIX)
- duration: time for which the application was running, in seconds. Note that turning the screen off effectively exits the application.
appusage.sql: applications run by users (event / built-in logfile driven)
[download]
- uid: user ID
- rowid: unique ID # (index) for phone calls (internal to LiveLab)
- number: phone number. This is one-way hashed for privacy.
- time: time and date (POSIX)
- duration: duration of phone call
- flags: Apple proprietary format for call attributes (incoming/outgoing, dropped, etc.)
- id: N/A (used internally by Apple)
call.sql: phone calls made / received by users (event / built-in logfile driven)
[download]
- id: unique index (internal to LiveLab)
- uid: user ID
- time: time and date (POSIX)
- duration: amount of time in this state
- sleeping: asleep (t) / awake (f)
sleep.sql: time that the phone spent in low power sleep mode
[download]
- id: unique index (internal to LiveLab)
- uid: user ID
- time: time and date (POSIX)
- duration: duration display had been in said state
- displayon: display status (on/off)
display.sql: display status (event / built-in logfile driven)
[download]
- id: unique index (internal to LiveLab)
- uid: user ID
- time: time and date (POSIX)
- Duration: amount of time in this state
- Charging: charging status
charging.sql: charging state of phone (interrupt driven)
[download]
- id: unique index (internal to LiveLab)
- uid: user ID
- time: time and date (POSIX)
- battery: battery level (percent)
- mah: battery voltage (millivolts)
- current: current flowing into battery (milliamperes)
- charging: battery being charged at this time
- charged: battery full at this time
power_detail.sql: (event / built-in logfile driven)
[download]
- uid: user ID
- time: time and date (POSIX)
- x: measured acceleration in the x axis (g)
- y: measured acceleration in the y axis (g)
- z: measured acceleration in the z axis (g)
accel.sql: accelerometer readings (typically every 15 minutes when phone is awake)
[download]
- uid: user ID
- time: time and date (POSIX)
- disk0_kbt: kilobytes transferred per disk transfer
- disk0_tps: number of transfers per second to flash
- disk0_mbs: transfer rate to flash memory in MB/s.
- cpu_user: percent of time cpu has been used by user processes since power on
- cpu_sys: percent of time cpu has been used by system processes since power on
- cpu_idle: percent of time cpu has been idle since power on
- load_1m: system load average over the last minute
- load_5m: system load average over the last 5 minutes
- load_15m: system load average over the last 15 minutes
iostat.sql: (periodic output from iostat: cpu and disk utilization)
[download]
- uid: user ID
- time: time and date (POSIX)
- geoid: the md5 hash of the geoid
- towerid: the md5 hash of the towerid
celltower.sql: (periodic output from the modem regarding the cell tower the phone is connected to)
[download]
- uid: user ID
- time: time and date (POSIX)
- csq: cell signal quality (dBm = csq * 2 - 113, reported by the baseband using AT+CSQ)
- ber: bit error rate (reported using AT+CSQ)
cellsignal.sql: (periodic output from the modem regarding the cell signal strength)
[download]
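The csq-to-dBm conversion given above can be sketched as a small helper. This is only an illustration, not part of the LiveLab tooling: the function name csq_to_dbm is hypothetical, and the handling of the value 99 as "not known" follows the usual AT+CSQ convention rather than anything stated for this dataset.

```python
from typing import Optional


def csq_to_dbm(csq: int) -> Optional[int]:
    """Convert an AT+CSQ signal quality value to dBm.

    Uses the formula from the field description: dBm = csq * 2 - 113.
    Assumption: 99 means "not known or not detectable" (the usual
    AT+CSQ convention), so it is mapped to None.
    """
    if csq == 99:
        return None
    if not 0 <= csq <= 31:
        raise ValueError(f"unexpected csq value: {csq}")
    return csq * 2 - 113


print(csq_to_dbm(0))   # weakest reportable level
print(csq_to_dbm(31))  # strongest reportable level
```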
- uid: user ID
- time: time and date (POSIX)
- ssid: the md5 hash of the ssid
- bssid: the md5 hash of the bssid
- channel: the WiFi channel
- rate: the connection rate (Mbps)
- rssi: the reported received signal strength indication (dBm)
associatedwifi.sql: (periodic output regarding the connection to the associated WiFi access point)
[download]
- uid: user ID
- starttime: time and date (POSIX) of the time the scan began (some scans take many seconds to complete, so this allows entries to be grouped together)
- time: time and date (POSIX) that this network was recorded
- ssid: the md5 hash of the ssid
- bssid: the md5 hash of the bssid
- channel: the WiFi channel
- rssi: the reported received signal strength indication (dBm)
availablewifi.sql: (periodic output regarding the available wifi access points)
[download]
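Since starttime exists so that entries from the same scan can be grouped together, a minimal sketch of that grouping might look like the following. It assumes the rows have already been loaded into dictionaries keyed by the column names above; the function name group_scans and the sample values are made up for illustration.

```python
from collections import defaultdict


def group_scans(rows):
    """Group availablewifi rows into scans keyed by (uid, starttime)."""
    scans = defaultdict(list)
    for row in rows:
        scans[(row["uid"], row["starttime"])].append(row)
    return dict(scans)


# Made-up sample rows: two networks seen in one scan, one in a later scan.
sample = [
    {"uid": "A01", "starttime": 100, "time": 101, "bssid": "aa", "rssi": -60},
    {"uid": "A01", "starttime": 100, "time": 104, "bssid": "bb", "rssi": -75},
    {"uid": "A01", "starttime": 900, "time": 902, "bssid": "aa", "rssi": -62},
]
for key, entries in group_scans(sample).items():
    print(key, len(entries))
```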
- id: unique index (internal to LiveLab)
- uid: user ID
- time: time and date (POSIX)
- duration: duration in seconds
loggeron.sql: time that the logger was running
[download]
- uid: user ID
- time: time and date (POSIX)
- count: see below
- url: hashed URL
web.sql: web browsing history
[download] [csv]
Understanding count
The web history was collected nightly from Safari's history file on iPhone 3GS, which only stores the timestamp of the most recent visit to a specific URL, but increments the "count" each time that URL was visited. Notably, this count is reset automatically for older entries, or can be reset by the user. Because of this, our logger may capture an entry multiple times between visits, or after a reset.
Therefore, a heuristic is needed to determine the correct visit count. For example, each of the following four sequences indicates 3 total visits to google.com/sub1/ after time = 1000:

Time, #count, URL
1000, 4, google.com/sub1/
2000, 1, google.com/sub1/
4000, 3, google.com/sub1/

Time, #count, URL
1000, 4, google.com/sub1/
2000, 1, google.com/sub1/
3000, 2, google.com/sub1/
4000, 3, google.com/sub1/

Time, #count, URL
1000, 4, google.com/sub1/
4000, 3, google.com/sub1/

Time, #count, URL
1000, 4, google.com/sub1/
2000, 1, google.com/sub1/
3000, 2, google.com/sub1/
4000, 1, google.com/sub1/
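
One possible heuristic that is consistent with the examples above is sketched below. It is not necessarily the heuristic used by the LiveLab authors. The function name count_new_visits is hypothetical, and the rule for an unchanged count with a later timestamp (treated as a reset followed by that many visits) is an assumption.

```python
def count_new_visits(entries):
    """Estimate visits to one URL (by one user) after the first entry.

    entries: iterable of (time, count) pairs; the earliest entry is
    treated as the baseline.
    """
    # Drop entries captured more than once, then order by time.
    ordered = sorted(set(entries))
    if not ordered:
        return 0
    _, prev_count = ordered[0]
    total = 0
    for _, count in ordered[1:]:
        if count > prev_count:
            # Count kept incrementing: the difference is the new visits.
            total += count - prev_count
        else:
            # Count dropped (or, by assumption, stayed equal with a later
            # timestamp): treat as a reset followed by `count` visits.
            total += count
        prev_count = count
    return total


print(count_new_visits([(1000, 4), (2000, 1), (4000, 3)]))
```

Applied to each of the four sequences above, this sketch yields 3 visits after time 1000.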
Understanding the URL hashing
The URLs are hashed due to privacy concerns. A URL can be parsed into six fields: scheme, netloc, path, params, query, and fragment. The field "netloc" has up to four sub-fields: username, password, host, and port (ref: urlparse and URI scheme). Different rules are applied to different fields:
- scheme: not hashed
- netloc:
  - username: one hash value
  - password: one hash value
  - host: one hash value per subdomain
  - port: not hashed
- path: one hash value per sub-path
- params: one hash value
- query: one hash value
- fragment: one hash value
Some general domain information for the field "host" is not hashed so that a user's browsing behavior can be better understood from the data. In particular, the following subdomains are not hashed:
- top-level domains (ref: List of Internet top-level domains)
- second-level domains if they are one of these: "ac", "co", "com", "net", "org", "edu", "gov", "sch", "gc"
- bottom-level domains if they are one of these: "www", "login", "m", "mobile", "iphone", "touch"
Here is an example of how a URL is hashed, assuming hash() is the hash function. hash() returns "hash#", where "#" is 1 to 9 in the following example.
url:
http://username:password@www.example.com:80/over/there/index.dtb;parameters?type=animal&name=narwhal#nose
hashed url:
http://hash(username):hash(password)@www.hash(example).com:80/hash(over)/hash(there)/hash(index.dtb);hash(parameters)?hash(type=animal&name=narwhal)#hash(nose)
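
The rules above can be sketched with Python's urllib.parse, which splits a URL into the same six fields. This is an illustrative reconstruction, not the code used to produce the dataset. The names hash_url, hash_host, and symbolic are hypothetical, the actual hash function is not specified here and is therefore passed in as a parameter, and "top-level domain" is approximated as the last label of the host.

```python
from urllib.parse import urlparse, urlunparse

# Labels left unhashed, taken from the lists above.
UNHASHED_SECOND_LEVEL = {"ac", "co", "com", "net", "org", "edu", "gov", "sch", "gc"}
UNHASHED_BOTTOM_LEVEL = {"www", "login", "m", "mobile", "iphone", "touch"}


def hash_host(host, h):
    """Hash each subdomain label except the exempt ones."""
    labels = host.split(".")
    last = len(labels) - 1
    out = []
    for i, label in enumerate(labels):
        keep = (
            i == last  # assumption: the last label is the top-level domain
            or (i == last - 1 and label in UNHASHED_SECOND_LEVEL)
            or (i == 0 and label in UNHASHED_BOTTOM_LEVEL)
        )
        out.append(label if keep else h(label))
    return ".".join(out)


def hash_url(url, h):
    """Apply the per-field rules; h is the (unspecified) hash function."""
    p = urlparse(url)
    netloc = ""
    if p.username is not None:
        netloc += h(p.username)
        if p.password is not None:
            netloc += ":" + h(p.password)
        netloc += "@"
    if p.hostname:  # note: urlparse lowercases the host name
        netloc += hash_host(p.hostname, h)
    if p.port is not None:
        netloc += f":{p.port}"  # port is not hashed
    # One hash value per non-empty sub-path; empty segments keep the slashes.
    path = "/".join(h(seg) if seg else seg for seg in p.path.split("/"))
    params = h(p.params) if p.params else ""
    query = h(p.query) if p.query else ""
    fragment = h(p.fragment) if p.fragment else ""
    return urlunparse((p.scheme, netloc, path, params, query, fragment))


def symbolic(s):
    """Stand-in hash that just marks what would be hashed."""
    return f"hash({s})"


example = ("http://username:password@www.example.com:80"
           "/over/there/index.dtb;parameters?type=animal&name=narwhal#nose")
print(hash_url(example, symbolic))
```

With the symbolic stand-in, the sketch reproduces the hashed URL shown above. A real hash function would be substituted for symbolic when working with actual URLs.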