An introduction to GSM and thoughts about a phone tracking system
Mobile phone technology is a very complex topic which nowadays is mainly in the hands of industry, far away from citizens, who can only be plain, passive users. For most of us a phone modem is just a black box: we don't know what is going on inside.
I am not an expert on this topic, but one day I decided to learn a bit about it, so I started reading about GSM technology. In my opinion the information is hard to find and hard to understand for a non-expert.
I asked myself a question: would it be possible for a third party to track the trajectory of a mobile phone through a geographic area? Then I found the IMSI catcher technology, which is pretty advanced nowadays. It is probably used by governments and the police to track us whenever they find it necessary. But of course it relies on illegal techniques, such as transmitting fake information on a privately licensed band. So I asked myself a second question: would it be possible for a third party to track a mobile phone using only legal methods? Obviously this implies not being active on the network and not breaking any encryption, just listening to the air.
In this article I try to summarize the basic ideas behind mobile phone technology and make them easy to understand. I also performed two small experiments to answer my second question.
I hope this is useful to somebody, at least as an entry point to this complex and obscure world.
GSM (Global System for Mobile Communications) is one of the current standards used by cell phones to send voice and data. A GSM network is organized hierarchically: a Service Area (the full network of an operator) contains several Mobile Switching Centres (MSC), each MSC controls multiple Location Areas (LA), and each LA is a set of base stations (BTS).
Each BTS covers a small geographic area (a BTS is the set of antennas we can see installed on top of buildings, for instance). A Location Area can be as big as a city or even bigger, depending on the operator.
The mobile terminal (phone) is responsible for detecting the different LAs and, when it switches from one to another, informing the network by executing a procedure named Location Update.
GPRS and EDGE run on top of GSM to transport data packets (such as Internet traffic). Commercially this is known as 2G and 2.5G. In Europe the bands used for GSM are P-GSM, E-GSM and GSM1800.
UMTS and LTE
In this text we focus on GSM, but the following technologies are also worth knowing.
UMTS (Universal Mobile Telecommunications System) was the successor of GSM, and LTE is the successor of UMTS. The data protocols commercially known as 3G run on UMTS, and what we know as 4G runs on LTE. Currently most new cell phones support the three technologies (GSM, UMTS and LTE).
In Europe, 3G/UMTS operates on 900 and 2100 MHz, and 4G/LTE operates on 800, 1500, 1800 and 2600 MHz.
IMSI and TMSI
A unique identifier (IMSI: International Mobile Subscriber Identity) is assigned to every subscriber operating in a GSM/UMTS/LTE network. It is usually 15 digits long and is stored in the SIM card used to subscribe the terminal to the operator's network.
The IMSI is composed of three parts:
- MCC: Mobile Country Code
- MNC: Mobile Network Code
- MSIN: Mobile Subscriber Identification Number, unique for each subscriber of the operator
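To make the three parts concrete, here is a minimal shell sketch that splits an IMSI. The function name split_imsi and the sample number are made up for illustration. The MNC can be 2 or 3 digits long and its length cannot be read from the IMSI itself, so the sketch takes it as an argument and assumes 2 by default.

```sh
# Split an IMSI into MCC, MNC and MSIN (illustrative helper, not from a standard tool).
# Assumption: the MNC length (2 or 3 digits) is given as second argument, default 2.
split_imsi() {
    imsi=$1
    mnc_len=${2:-2}
    mcc=$(printf '%s' "$imsi" | cut -c1-3)
    mnc=$(printf '%s' "$imsi" | cut -c4-$((3 + mnc_len)))
    msin=$(printf '%s' "$imsi" | cut -c$((4 + mnc_len))-)
    printf 'MCC=%s MNC=%s MSIN=%s\n' "$mcc" "$mnc" "$msin"
}

# Made-up IMSI: MCC 214 is Spain, the remaining digits are invented.
split_imsi 214011234567890
```

With the invented number above this prints MCC=214 MNC=01 MSIN=1234567890.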
When a terminal connects to a network for the first time, the operator starts a procedure named IMSI Attach, which identifies the terminal using its IMSI. After this registration and authentication process, the operator assigns a temporary identifier named TMSI (Temporary Mobile Subscriber Identity), which is used to hide the real identity (IMSI) of the subscriber. Its format is defined by the operator: it can be derived from the current date and time, a database index or some other information.
The assigned TMSI (4 bytes) is stored in the network's VLR (Visitor Location Register) and also in the terminal's SIM card. From then on this temporary identifier is used for most procedures, such as the Location Update.
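On the SIM, the TMSI is kept in the file EF_LOCI, which is the file the attached script reads with AT+CRSM=176,28542,0,0,11 (28542 is 0x6F7E). The following sketch decodes such a record, assuming the layout given in 3GPP TS 51.011: TMSI (4 bytes), LAI (5 bytes), TMSI TIME (1 byte) and location update status (1 byte). The function name decode_loci and the sample record are hypothetical; only the TMSI value comes from the sample output of Experiment 1.

```sh
# Decode an 11-byte EF_LOCI record given as 22 hex characters.
# Assumed layout (3GPP TS 51.011): TMSI(4) LAI(5) TMSI_TIME(1) STATUS(1).
decode_loci() {
    hex=$1
    tmsi=$(printf '%s' "$hex" | cut -c1-8)
    lai=$(printf '%s' "$hex" | cut -c9-18)
    tmsi_time=$(printf '%s' "$hex" | cut -c19-20)
    status=$(printf '%s' "$hex" | cut -c21-22)
    # Inside the LAI, MCC and MNC digits are stored as swapped BCD nibbles.
    mcc=$(printf '%s' "$lai" | cut -c2)$(printf '%s' "$lai" | cut -c1)$(printf '%s' "$lai" | cut -c4)
    mnc=$(printf '%s' "$lai" | cut -c6)$(printf '%s' "$lai" | cut -c5)
    mnc3=$(printf '%s' "$lai" | cut -c3)
    # An F in this nibble means the MNC has only two digits.
    case $mnc3 in
        F|f) ;;
        *) mnc="$mnc$mnc3" ;;
    esac
    lac=$(printf '%s' "$lai" | cut -c7-10)
    printf 'TMSI=%s MCC=%s MNC=%s LAC=%s TMSI_TIME=%s STATUS=%s\n' \
        "$tmsi" "$mcc" "$mnc" "$lac" "$tmsi_time" "$status"
}

# Hypothetical record: everything after the first 8 characters is invented.
decode_loci 2049158512F4100A2BFF00
```

Comparing the TMSI and LAC fields of two successive readings is enough to tell whether the TMSI or the location area has changed.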
The TMSI can change in several situations: when the terminal switches to another network device, but also after a call or after a period of time. It depends on the operator, and there is no clear rule about it.
Paging and system messages
The GSM network is constantly sending information (packets) on the broadcast channels. These are received by all the terminals located in the same area. Cell phones can also communicate with the network, for example to start a call or send a location update.
Some of the communication channels are:
- CCCH: Common Control Channel, used to send paging and channel assignment messages to the terminals (downlink)
- RACH: Random Access Channel, used by the terminals to request access from the BTS (uplink)
- BCCH: Broadcast Control Channel, used by the network (MSC/BTS) mainly to announce itself and broadcast system information (downlink)
- SDCCH: Stand-alone Dedicated Control Channel, a private channel used for short exchanges, for instance to set up a phone call, send an SMS or perform a Location Update
Paging messages are notifications sent by the MSC to all the BTS that make up a location area. A terminal registered in the area receives these notifications (through a broadcast channel) and replies to them (if necessary), usually over a private uplink channel.
The paging process is repeated constantly, usually as the following diagram shows:
As a reminder, a terminal only has to update its location when it moves to a different Location Area (controlled by an MSC). Thus the network usually does not know which BTS the terminal is within range of.
So the MSC uses paging requests to find out which BTS it must use to communicate with the terminal. Each request carries the IMSI or TMSI of the terminal (one or several) it is addressed to. If such a terminal receives the paging request, it replies through the RACH uplink channel asking for a private channel assignment. The BTS that received the reply then sends an “Immediate Assignment” with the data required to start the private communication, such as an SDCCH channel.
System Information messages are broadcast within an LA to provide information about the network and the BTS cells. The relevant types are:
- Type 1: list of the ARFCNs used by the cell and RACH control parameters.
- Type 2: neighbour cell description (BCCH frequency list, that is, the ARFCNs of the neighbouring cells).
- Type 3: cell identity, LAI (MCC+MNC+LAC) and some GPRS information.
- Type 4: LAI (MCC+MNC+LAC), cell selection parameters and RACH control parameters.
- Type 2ter: neighbour cell description (extended BCCH frequency list).
The Absolute Radio Frequency Channel Number (ARFCN) is a code that identifies the pair of carrier frequencies (channels) available to the terminal for uplink and downlink communication with the BTS. So by capturing the ARFCN we might deduce the frequency of the RACH uplink channel.
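As an illustration of how an ARFCN maps to real frequencies, the sketch below converts it to the uplink and downlink carriers for the three bands named earlier (P-GSM, E-GSM and GSM1800). The formulas are the standard 200 kHz channel raster, with a 45 MHz duplex spacing in the 900 MHz band and 95 MHz in the 1800 MHz band. The function name arfcn_to_khz is made up, and frequencies are kept in kHz so that plain integer arithmetic is enough.

```sh
# Convert an ARFCN to its uplink/downlink carrier frequencies in kHz.
# Covers only P-GSM 900, E-GSM 900 and GSM 1800 (illustrative helper).
arfcn_to_khz() {
    n=$1
    if [ "$n" -ge 0 ] && [ "$n" -le 124 ]; then
        up=$((890000 + 200 * n)); offset=45000            # P-GSM (and E-GSM channel 0)
    elif [ "$n" -ge 975 ] && [ "$n" -le 1023 ]; then
        up=$((890000 + 200 * (n - 1024))); offset=45000   # E-GSM extension
    elif [ "$n" -ge 512 ] && [ "$n" -le 885 ]; then
        up=$((1710200 + 200 * (n - 512))); offset=95000   # GSM 1800
    else
        echo "ARFCN $n is outside the supported bands" >&2
        return 1
    fi
    printf 'uplink=%d kHz downlink=%d kHz\n' "$up" $((up + offset))
}

arfcn_to_khz 1
```

For ARFCN 1 this prints uplink=890200 kHz downlink=935200 kHz. In a typical cell configuration the RACH sits on the uplink frequency of the same ARFCN that carries the BCCH, but that is an assumption to verify for each network.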
Location Update
Location Update (LU) is the procedure that keeps the network informed of the location area where the terminal is. As already mentioned, the terminal must keep this area up to date. Usually, after 60 minutes without a location update, the network considers the terminal disconnected. On most terminals the procedure is executed every 10 or 20 minutes.
The terminal listens constantly to the BCCH channels broadcast by the network to know which base stations (BTS) are around, and it switches to the one with the best signal. Through this channel the terminal can read the location area identifier (LAI). If it differs from the previous one, the terminal starts the Location Update procedure. Note that at this point the TMSI will change.
Experiment 1: TMSI change
The objective of this experiment is to quantify the number of TMSI changes that occur when traveling around one or several medium-sized cities. To this end a modified LG E610 Android phone was used. We use AT commands to get the network information (see the attached script get_network_info.sh).
This is an example of the script output:
Sun Nov 8 10:12:00 CEST 2016
Provider: vodafone ES
TMSI: 20491585
TMSI_time: 12
KC: 0E0993DE315C662401
CyphMode: 000000
CellID: C2EEA8
RSSI: -78 dBm
The experiment was performed in Catalonia, across three cities north of Barcelona, each with more than 50k inhabitants.
- Distance: 42.86 km
- Time: 3h 35m
- Average speed: 38 km/h
- Technology: GSM 2G
- Company: Vodafone ES
These are the results of the experiment:
- Number of different GSM cells obtained: 33
- Number of Location Areas visited: 1
- Number of TMSI changes: 0
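Counts like these can be extracted from a log built by appending the output of get_network_info.sh on every run. The sketch below is one possible way to do it, assuming the log keeps the "TMSI: ..." and "CellID: ..." lines exactly as the script prints them. The function name summarise_log is made up, and the second cell ID in the example is invented.

```sh
# Count distinct cells and TMSI changes in a log read from standard input.
# Assumption: the log contains the script's "TMSI: <hex>" and "CellID: <hex>" lines.
summarise_log() {
    awk '
        $1 == "CellID:" { cells[$2] = 1 }
        $1 == "TMSI:" {
            # A change is counted when the value differs from the previous reading.
            if (last != "" && $2 != last) changes++
            last = $2
        }
        END {
            n = 0
            for (c in cells) n++
            printf "cells=%d tmsi_changes=%d\n", n, changes
        }
    '
}

# Two readings: same TMSI, two different cells (C2EEA9 is invented).
printf 'TMSI: 20491585\nCellID: C2EEA8\nTMSI: 20491585\nCellID: C2EEA9\n' | summarise_log
```

For the two readings above this prints cells=2 tmsi_changes=0.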
So the operator (Vodafone in this case) uses huge location areas, and the TMSI did not change at any point during the trip. This makes tracking a phone easier.
As a further check we also made a call and, in this case, the TMSI did change.
Experiment 2: Capturing data
The objective of this experiment is to check how feasible it really is to capture GSM data using commodity hardware. To this end we used a DVB-T (TDT) tuner compatible with the rtl-sdr software (it costs less than 10€). The device can tune frequencies up to 1700 MHz, but we run the experiment in the 900 MHz GSM band. We will try to find a CCCH channel and capture the paging packets sent by the BTS in order to identify IMSI/TMSI numbers.
Using the grgsm_livemon software we can identify a downlink broadcast channel by looking at the spectrum plot. When we move the center frequency to the one corresponding to the broadcast channel, grgsm_livemon detects it and starts demodulating the information. The decoded data is handed to the user as GSMTAP packets (UDP over IP, sent to the loopback interface by default).
The next step is to start Wireshark to analyze the data using the corresponding filters.
In the screenshot we can see that in just a few seconds we have already captured many paging packets containing a TMSI/IMSI. We can also see the “System Information” packets, which can be used to obtain information about the network.
To go a bit further, we found and modified a Python script which automatically analyzes the data packets captured by grgsm_livemon and extracts the IMSI and TMSI.
However, the terminals behind the captured TMSI/IMSI are not necessarily under the coverage of the BTS we are listening to. They can be located at any other point of the Location Area.
On the other hand, the captured “Immediate Assignment” messages are sent by the BTS in response to a terminal's request for a private channel. But there is no direct relation between a Paging Request and an Immediate Assignment, so it is not easy to deduce which BTS the terminal is attached to.
To the question “is it possible for a third party to follow a terminal through an area using only passive methods?”, the first answer, according to our experiments, is no. But there are two possible solutions, hard or maybe impossible to implement:
- A system able to deduce the uplink channels (RACH or SDCCH) used by a specific BTS by analyzing the “System Information” packets captured on the broadcast channel. Capturing the IMSI/TMSI on those channels would make it possible to deduce which BTS (CellID) the terminal is associated with. So if we know where the BTS is geographically located, we can estimate the trajectory of the terminal over a period of time.
- A system able to relate the “Paging Request” (sent by all the BTS, carrying the TMSI/IMSI) to the “Immediate Assignment” (sent only by the BTS where the terminal is attached). This way we could know which BTS the terminal is attached to and deduce its geographical location. Maybe the time elapsed between PR and IA could be used to establish this relation, at the price of a high error rate.
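The time-based idea can be sketched as follows. This is only a toy illustration, not something we tested: it assumes a hypothetical event list with one line per captured message, in the form "seconds PR tmsi" or "seconds IA", and for every Immediate Assignment it lists the TMSIs paged within a chosen window before it. The function name pair_by_time, the input format and the sample values are all invented.

```sh
# For each Immediate Assignment (IA), list the TMSIs paged (PR) shortly before it.
# Input on stdin (hypothetical format): "<seconds> PR <tmsi>" or "<seconds> IA".
# $1: window in seconds (default 1).
pair_by_time() {
    awk -v win="${1:-1}" '
        $2 == "PR" { t[++n] = $1; id[n] = $3 }
        $2 == "IA" {
            out = ""
            for (i = 1; i <= n; i++)
                if ($1 - t[i] >= 0 && $1 - t[i] <= win)
                    out = out " " id[i]
            print $1 ":" out
        }
    '
}

# Invented events: the second IA has two candidate TMSIs.
printf '%s\n' '10.0 PR AAAA0001' '10.4 IA' '12.0 PR BBBB0002' '12.1 PR CCCC0003' '12.6 IA' | pair_by_time 1
```

The second assignment in the example matches two paged TMSIs, which shows where the high error rate would come from: on a busy location area many paging requests fall inside any reasonable window.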
Of course, implementing a tracking system would require deploying a significant number of devices across the territory to be monitored.
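# get_network_info.sh: query the modem with AT commands and print network data.
# "timeout -t 2" is the syntax accepted by some BusyBox builds; others use "timeout 2".
dev="/dev/smd11"

echo ""
date

# Current operator name
echo -e "AT+COPS?\r" > $dev
PROV="$(timeout -t 2 cat $dev | grep "COPS:" | cut -d, -f3 | tr -d \")"
echo "Provider: $PROV"

# Read 11 bytes of SIM file 28542 (0x6F7E, EF_LOCI); the first 4 bytes are the TMSI
echo -e "AT+CRSM=176,28542,0,0,11\r" > $dev
TMSI="$(timeout -t 2 cat $dev | grep "CRSM:" | cut -d, -f3 | tr -d \")"
echo "TMSI: $(echo $TMSI | head -c8)"
# Prints hex characters 9-10 of the record. Note: in the EF_LOCI layout of
# 3GPP TS 51.011 these are the first LAI byte; the TMSI TIME byte is characters 19-20.
echo "TMSI_time: $(echo $TMSI | head -c10 | tail -c2)"

# Read 9 bytes of SIM file 20256 (0x4F20): the ciphering key Kc
echo -e "AT+CRSM=176,20256,0,0,9\r" > $dev
KC="$(timeout -t 2 cat $dev | grep "CRSM:" | cut -d, -f3 | tr -d \")"
echo "KC: $KC"

# Read 3 bytes of SIM file 28589 (0x6FAD), reported here as the ciphering mode
echo -e "AT+CRSM=176,28589,0,0,3\r" > $dev
KCM="$(timeout -t 2 cat $dev | grep "CRSM:" | cut -d, -f3 | tr -d \")"
echo "CyphMode: $KCM"

# Enable extended registration reports, then read the cell ID (4th field of +CREG)
echo -e "AT+CREG=2\r" > $dev
echo -e "AT+CREG?\r" > $dev
CID="$(timeout -t 2 cat $dev | grep "CREG:" | cut -d, -f4)"
echo "CellID: $CID"

# Signal quality: convert the CSQ index (0..31, 2 dB steps) to dBm
echo -e "AT+CSQ\r" > $dev
RSSI="$(timeout -t 2 cat $dev | grep "CSQ:" | cut -d: -f2 | cut -d, -f1)"
dBm=$((-112+2*$RSSI))
echo "RSSI: $dBm dBm"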