Source Code Navigation Tools

Use Case

My current day job is as a lead verification engineer for an embedded firmware team. As a result, I spend the majority of my time going through code which was written by colleagues of mine. One of the main pain points of debugging a colleague's code is that you may not know where the various function definitions or variable declarations live.

Well-designed software ideally follows a consistent coding style and file organization, which makes traversing a large code base more intuitive and would ideally resolve the code organization issue altogether. In my personal experience, however, code organization in production code takes a backseat to functionality and customer deadlines.

Another typical use case for code navigation tools is legacy code support. If you’re stuck supporting legacy code, it is seldom a good idea to re-factor poorly organized code solely to ease maintenance or to match currently approved coding practice. From a business mindset, changes to a stable code-line carry a quality risk: every change committed by developers is an opportunity to introduce bugs. All changes to mature code-lines must therefore be justified against that risk, and so re-factoring legacy production code is usually avoided.

So now that we’re stuck navigating sub-optimally organized code, what can we do about it?

Two tools, which are pre-installed on many Linux systems or available from the distribution's package manager, allow for easier navigation of code.

  • CTAGS
  • CSCOPE

The discussion below starts with CTAGS and then moves on to CSCOPE as a logical progression. CTAGS stores its code index information in plain text, so it is a little easier to understand what is going on under the hood. This order also matches my own progression with these tools: I found CTAGS first, and it allowed me to navigate and debug code authored by others much more efficiently than before. After getting familiar with its basic functionality, I wanted a more complete code navigation tool with more features, and then I found CSCOPE.

Details of both tools are provided below.

Details of this demo

“The internet” usually refers to the Linux kernel as a large, complicated codeline. I will be using this code for my demo, to show how quickly a new developer can navigate through unfamiliar code. I have never worked on the Linux kernel source code, so this should be a good test of the navigation tools. It is also a test-case which can be easily replicated by readers of this blog.

I will be using the following Git repository:

https://github.com/torvalds/linux

CTAGS

CTAGS generates an index file (called a tags file) of the source files it is given. In this demo, that is the source code beneath the user’s current working directory.

Getting set-up

First, we must ensure CTAGS is installed on our machine.

$ which ctags

/usr/bin/ctags

Next, we need to obtain a local copy of the Linux kernel.

$ git clone https://github.com/torvalds/linux.git

Building tags file

Next, we’ll change into the Linux folder, and run the tool.

$ cd linux

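$ find . -name "*.c" | ctags -L -

The above command lists every .c file within the current working directory (and sub-directories), and passes that list to ctags on standard input ("-L -"). Piping the list through xargs instead can split a list this long across several ctags runs, and each run without "-a" would overwrite the tags file written by the previous one.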

$ find . -name "*.h" | xargs ctags -a -o tags

The above command analyzes all the header files, and appends the data to the tags file.

An excerpt from the resulting tags file is shown below:

!_TAG_FILE_FORMAT	2	/extended format; --format=1 will not append ;" to lines/
!_TAG_FILE_SORTED	1	/0=unsorted, 1=sorted, 2=foldcase/
!_TAG_PROGRAM_AUTHOR	Darren Hiebert	/dhiebert@users.sourceforge.net/
!_TAG_PROGRAM_NAME	Exuberant Ctags	//
!_TAG_PROGRAM_URL	http://ctags.sourceforge.net	/official site/
!_TAG_PROGRAM_VERSION	5.8	//
ACCT_TIMEOUT	./acct.c	/^#define ACCT_TIMEOUT	/;"	d	file:
xsk_map_node_free	./bpf/xskmap.c	/^static void xsk_map_node_free(struct xsk_map_node *node)$/;"	f	file:
xsk_map_ops	./bpf/xskmap.c	/^const struct bpf_map_ops xsk_map_ops = {$/;"	v	typeref:struct:bpf_map_ops
xsk_map_put	./bpf/xskmap.c	/^void xsk_map_put(struct xsk_map *map)$/;"	f

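This file maps each symbol name (function, variable or macro) to a filename and a search pattern which locates the definition within that file. The letter after the pattern records the kind of symbol: “f” for a function, “v” for a variable and “d” for a macro definition. With this information, your text editor can open the correct file and navigate to the correct definition.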
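To make the lookup concrete, here is a minimal shell sketch of what an editor does with one tags entry. It is an illustration only: the temporary directory, the stand-in source file and the lookup_tag helper are all made up for this example, and real entries may contain escaped characters which this sketch does not handle.

```shell
#!/bin/sh
# Sketch: resolve a symbol through a ctags-style tags file by hand.
# The sample file and the lookup_tag helper are hypothetical.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# A tiny stand-in for ./bpf/xskmap.c.
mkdir bpf
cat > bpf/xskmap.c <<'EOF'
#include <stddef.h>

struct xsk_map;

void xsk_map_put(struct xsk_map *map)
{
	(void)map;
}
EOF

# One entry in the same layout as the tags snippet above:
# name <TAB> file <TAB> /^search pattern$/;" <TAB> kind
printf 'xsk_map_put\t./bpf/xskmap.c\t/^void xsk_map_put(struct xsk_map *map)$/;"\tf\n' > tags

lookup_tag() {
	# Pick the first entry whose first field equals the requested name.
	entry=$(awk -F '\t' -v name="$1" '$1 == name { print; exit }' tags)
	file=$(printf '%s\n' "$entry" | cut -f 2)
	# Strip the leading /^ and the trailing $/;" to recover the source line.
	text=$(printf '%s\n' "$entry" | cut -f 3 | sed -e 's,^/\^,,' -e 's,\$/;"$,,')
	# Whole-line, fixed-string search; keep only the line number.
	line=$(grep -n -x -F -e "$text" "$file" | head -n 1 | cut -d : -f 1)
	printf '%s:%s\n' "$file" "$line"
}

location=$(lookup_tag xsk_map_put)
echo "$location"
```

The printed file:line pair is the position an editor would jump to. Storing a search pattern rather than a line number means the entry still resolves after unrelated lines are added to or removed from the file.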

A few useful shortcuts

These shortcuts are for VIM, which picks up a file named tags in the current working directory by default. To jump to the definition of the symbol which is under the cursor, press “Ctrl + ]”

The jumps are kept in a stack. To pop the stack, and jump back to your previous position, press “Ctrl + t”

Summary

With a few useful shortcuts, and the use of source code indexing tools such as CTAGS, developers can quickly trace through source code, without worrying about file organization.

Room for improvement

After reviewing the tags file, it is clear that the only information which is indexed in this file (by default) is where each symbol is defined. However, there are cases where I would like to ask questions such as:

Find all locations where function “x” is called

OR

Find all assignments to variable “GO_FAST_SWITCH”

With CTAGS as used in this demo, this cannot be done.

Enter CSCOPE

CSCOPE

CSCOPE provides overlapping functionality with CTAGS, as well as additional functionality. The “jump to and from” functionality is the same. The queries which CSCOPE offers are:

  • Find this C symbol
  • Find this global definition
  • Find functions called by this function
  • Find functions calling this function
  • Find this text string
  • Find this egrep pattern
  • Find this file
  • Find files #including this file
  • Find assignments to this symbol

The additional functionality above allows for much easier traversal through source code authored by others. Unlike CTAGS, CSCOPE stores all of this information, including the relational information mentioned above, in its own database file which is not meant to be read by hand. However, the down-side of CSCOPE is that it is a little more difficult to configure for use within VIM. I imagine the same applies to the EMACS text editor.

Getting set-up

First, we must ensure CSCOPE is installed on our machine.

$ which cscope

/usr/bin/cscope

Next, we need to obtain a local copy of the Linux kernel.

$ git clone https://github.com/torvalds/linux.git

Building the cscope database

$ cd linux

$ cscope -R

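The above command builds the relational symbol database cscope.out, and then opens the interactive CSCOPE interface. To build the database without opening the interface, add the “-b” flag.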
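The database can also be queried without the interactive interface, which is a quick way to check that it was built correctly. The sketch below assumes cscope is installed and uses two throwaway C files (main.c and helper.c, both made up for this example) instead of the kernel, so it runs in a moment. It relies on the “-b” (build only), “-k” (skip the system include directory), “-d” (do not rebuild) and “-L -3” (line-oriented “functions calling this function” query) options.

```shell
#!/bin/sh
# Sketch: build a cscope database for a throwaway project and ask
# "which functions call helper?" without opening the interactive UI.
# main.c and helper.c are hypothetical sample files.
set -eu

callers=""
if command -v cscope >/dev/null 2>&1; then
	workdir=$(mktemp -d)
	trap 'rm -rf "$workdir"' EXIT
	cd "$workdir"

	cat > helper.c <<'EOF'
int helper(void)
{
	return 42;
}
EOF

	cat > main.c <<'EOF'
int helper(void);

int main(void)
{
	return helper();
}
EOF

	# -R: recurse into sub-directories, -b: build cscope.out and exit,
	# -k: do not index the system include directory.
	cscope -R -b -k

	# -d: reuse the existing database, -L -3: one line-oriented query
	# for "functions calling this function".
	callers=$(cscope -d -L -3 helper)
	printf '%s\n' "$callers"
else
	echo "cscope is not installed; nothing to demonstrate"
fi
```

Each result line names the file, the calling function, the line number and the source text of the call. The same invocation works against the kernel database, with the function of interest in place of helper.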

Configuring VIM for use of cscope

When I was first configuring this tool myself, I used the VIM set-up guide from the CSCOPE project on SourceForge.

From this resource, we see that there is a vimscript file. So long as you source this file within your vimrc file, CSCOPE shortcuts within VIM will be available to you. The basic objective of this vimscript file is to provide quick shorthands to full commands.

For example -

nmap <C-\>s :cs find s <C-R>=expand("<cword>")<CR><CR>

This command will execute “cs find s <$VAR>”, where <$VAR> is the word under your cursor.

A few useful shortcuts

The main shortcut, which will allow you to access all CSCOPE commands is::

CTRL + \

This command should be followed by a character to specify the relationship to find. For example -

CTRL + \ -> s

When “s” follows “Ctrl + \”, all references to the symbol under the cursor are listed.

CTRL + \ -> c

When “c” follows “Ctrl + \”, all functions which call the function under the cursor are listed.
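The letter which follows “Ctrl + \” selects one of the CSCOPE queries listed earlier. As a quick reference, the sketch below pairs each query with the letter accepted by VIM's “cs find” command and with the field number used by the CSCOPE command line. The cs_find_letter helper is made up for this example. Field 5 (“Change this text string”) is left out because it has no “cs find” equivalent.

```shell
#!/bin/sh
# Sketch: translate a cscope field number into the letter used by
# Vim's ":cs find". The cs_find_letter helper is hypothetical.
set -eu

cs_find_letter() {
	case "$1" in
		0) echo s ;;  # Find this C symbol
		1) echo g ;;  # Find this global definition
		2) echo d ;;  # Find functions called by this function
		3) echo c ;;  # Find functions calling this function
		4) echo t ;;  # Find this text string
		6) echo e ;;  # Find this egrep pattern
		7) echo f ;;  # Find this file
		8) echo i ;;  # Find files #including this file
		9) echo a ;;  # Find assignments to this symbol
		*) echo "unsupported query number: $1" >&2; return 1 ;;
	esac
}

# Print the full table, one query per line.
for number in 0 1 2 3 4 6 7 8 9; do
	printf 'cscope -L -%s  <->  :cs find %s\n' "$number" "$(cs_find_letter "$number")"
done
```

So “Ctrl + \” followed by “g” jumps to a global definition, and “Ctrl + \” followed by “i” lists the files which #include the file under the cursor, assuming the mappings from the vimscript file are loaded.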

Summary

Given that CSCOPE provides all functionality which CTAGS does in addition to other relational search features, I have switched to exclusively using CSCOPE.

I hope that this short demo has highlighted the value of these tools. These two tools have made my day-to-day work significantly easier, and so I hope to share this with others.