How We Understand the Internals of a Text Editor by Code Reading - Part 1

A lot of programmers know that code reading is important, but there is little information, with examples, about how to actually read code. This series tries to show ways to approach and read code by taking on a challenge: reading all the source code of Ace (Ajax.org Cloud9 Editor).

The aim of this challenge is not only to demonstrate code reading with some tips, but also to explore ideas for new features of Read it easy.

To begin this series, let's work out how the edit function of Ace is implemented internally by reading its code.

Preparation for code reading

There are some preparations we can make before we begin to read code. They are not hard, but they are effective when we face a large chunk of code.

Check the number of lines of code

First, we wanted to check how many lines of code Ace has. This is useful for planning how to read a large chunk of code.

After counting, we found that Ace v1.5.0 had more than 359k lines of code, an overwhelming amount. Without the 'mode', 'test' and 'theme' directories, however, it had about 74k lines, which is not so bad.

Directory         Lines of code
/lib/ace          359,121
/lib/ace/mode     275,648
/lib/ace/test       2,893
/lib/ace/theme      6,504

We used the command git ls-files | xargs wc -l to count. When we have to read a huge amount of code, knowing how many lines there are and where the large files are helps us read efficiently.

This information prompted us to look into the 'mode', 'test' and 'theme' directories. They contained many files that follow similar patterns, so it turned out that we didn't have to read every file in them. Skipping them saved us about 285k lines of reading, which was another benefit of checking the amount of code first.
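The counting step can be sketched in Python. This is only a minimal sketch, not the script used for this article: the function name lines_per_directory, its depth parameter and the sample wc -l output are made up for illustration. Only the four totals in the second half come from the table above. Note that the sketch counts files directly inside each directory, while the /lib/ace row of the table covers the whole tree.

```python
from collections import Counter
from pathlib import PurePosixPath


def lines_per_directory(wc_output: str, depth: int = 3) -> Counter:
    """Sum `wc -l` line counts per directory, keeping at most `depth` path parts."""
    totals = Counter()
    for line in wc_output.splitlines():
        fields = line.strip().split(None, 1)
        # Skip blank lines and the "total" rows that wc prints.
        if len(fields) != 2 or not fields[0].isdigit() or fields[1] == "total":
            continue
        directory_parts = PurePosixPath(fields[1]).parts[:-1]  # drop the file name
        directory = "/".join(directory_parts[:depth]) or "."
        totals[directory] += int(fields[0])
    return totals


# Hypothetical sample output; these counts are invented, not measured.
sample = """\
  700 lib/ace/document.js
  120 lib/ace/mode/javascript.js
   80 lib/ace/mode/css.js
   50 lib/ace/theme/monokai.js
  950 total
"""
per_directory = lines_per_directory(sample)

# Totals reported in the table above.
whole_tree = 359_121
low_priority = 275_648 + 2_893 + 6_504  # mode + test + theme
remaining = whole_tree - low_priority
print(low_priority, remaining)  # prints: 285045 74076
```

The two printed numbers match the "285k lines saved" and "about 74k lines" figures quoted in the text.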

Code Reading Tips: Check the number of lines of code before reading. Find the directories that are a low priority to read.

Get some information from the official site

We still haven't begun to read code.

The official site is worth visiting for information before reading the source code. It is easier to understand or infer the internal implementation if we have background knowledge of the software, such as its functional specs or use cases. As a rule of thumb, such background knowledge speeds up code reading. There is no need to read every article at this step. It is enough to pick up whatever information catches your eye.

According to the official site of Ace, Ace had 14 core classes and the ER diagram showed the relationships between 6 classes.

Class                 Events   Methods
Ace                   -        3
Anchor                1        5
BackgroundTokenizer   1        7
Document              1        25
EditSession           14       90
Editor                7        153
Range                 -        26
Scrollbar             1        5
Search                -        6
Selection             2        58
TokenIterator         -        5
Tokenizer             -        1
UndoManager           -        6
VirtualRenderer       -        72

Figure 1. ER diagram of Ace / Image: Ajax.org Cloud9 Editor

Inference from ER diagram

From the diagram, we inferred the data flow of text shown below. While we edit text in Ace, the text data is probably passed between these classes in this order.

Document -> EditSession -> Editor -> VirtualRenderer

On the basis of this inference, we planned to read the source code of these four classes and work out the role each plays in the edit function of Ace.

Making an inference before reading also helps us avoid getting lost. When we don't find any critical clues or hints during code reading, we easily become confused and lose our way. We always write the inference down so that we can return to it when we have to change direction.

Code Reading Tips: Infer structures or relationships before reading, and write them down.

Document class

Let's begin to read code! From its name, we could infer that the Document class deals with text. Document was also the starting point of the data flow we inferred from the ER diagram.

First, we skimmed the entire code (700 lines) of the Document class. What functions did the Document class have? We mainly scanned variable names and method names, but this approach didn't yield much information. Next, we read the test file "document_test.js" before reading document.js line by line.

Test code shows how to use a class or a method from an external point of view. In addition, a test name explicitly describes how a method is meant to be used. So it is often effective to read the test code first, because it gives us an overview of a class or a method.

Code Reading Tips: To grasp an overview of a class, check its test code first to learn how the class is used and to pick up information such as its requirements.

Reading the test names and the tests themselves, we found that document_test.js mainly tested methods related to inserting or removing lines. When we went back to the Document class with that in mind, many of its methods did indeed include "insert" or "remove" in their names. Therefore, the Document class was probably responsible for manipulating text.
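Skimming test names can also be done mechanically. The Python sketch below assumes that test cases are written as object keys of the form "test: ...": function() {...}; that pattern, the helper names extract_test_names and keyword_counts, and the sample file content are all assumptions for illustration, not a copy of document_test.js.

```python
import re
from collections import Counter

# Assumption: each test case is an object key such as "test: ...": function() {...}
TEST_NAME = re.compile(r'"(test[^"]*)"\s*:\s*function')


def extract_test_names(source: str) -> list[str]:
    """Return the test names found in the text of a test file."""
    return TEST_NAME.findall(source)


def keyword_counts(names: list[str], keywords: list[str]) -> Counter:
    """Count how many test names mention each keyword."""
    counts = Counter()
    for name in names:
        for keyword in keywords:
            if keyword in name.lower():
                counts[keyword] += 1
    return counts


# Hypothetical test file content, invented for this example.
sample = '''
module.exports = {
    "test: insert text in line": function() {},
    "test: insert lines": function() {},
    "test: remove lines": function() {},
    "helper": function() {}
};
'''
names = extract_test_names(sample)
counts = keyword_counts(names, ["insert", "remove"])
print(names)
print(counts)
```

A list like this is only a starting point. It tells us which words dominate the test names, and the tests themselves still have to be read to see how the methods are used.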

The next question we wanted to answer was which property best represented Document. To find it, we read the code around the properties and private methods whose names start with $ or _. The property $lines turned out to hold the text as an array of strings. This was obviously the representative property of Document.

The above investigation revealed the following:

  • Most of the methods of Document were related to manipulating strings.
  • The representative methods of Document were insert and remove.
  • Document held the text line by line as an array of strings in its $lines property.
  • The text itself lived in the Document class.
  • Document didn't call other classes such as EditSession, Editor and VirtualRenderer.
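The findings above can be illustrated with a toy model in Python. This is a sketch of the general idea (text kept as an array of lines, edited through insert and remove), not Ace's actual implementation: the class name LineDocument, the method signatures and the (row, column) tuples are simplifications chosen for this example.

```python
class LineDocument:
    """Toy model of a document that stores its text as a list of lines."""

    def __init__(self, text: str = "") -> None:
        self.lines = text.split("\n")  # plays the role of a $lines-like array

    def get_value(self) -> str:
        return "\n".join(self.lines)

    def insert(self, row: int, column: int, text: str) -> tuple[int, int]:
        """Insert text at (row, column) and return the position just after it."""
        line = self.lines[row]
        pieces = text.split("\n")
        # The end column depends on whether the inserted text spans several lines.
        end_column = len(pieces[-1]) + (column if len(pieces) == 1 else 0)
        pieces[0] = line[:column] + pieces[0]
        pieces[-1] = pieces[-1] + line[column:]
        self.lines[row:row + 1] = pieces
        return row + len(pieces) - 1, end_column

    def remove(self, start: tuple[int, int], end: tuple[int, int]) -> str:
        """Remove the text between two (row, column) positions and return it."""
        (start_row, start_col), (end_row, end_col) = start, end
        first, last = self.lines[start_row], self.lines[end_row]
        if start_row == end_row:
            removed = first[start_col:end_col]
        else:
            middle = self.lines[start_row + 1:end_row]
            removed = "\n".join([first[start_col:], *middle, last[:end_col]])
        # Join what is left of the first and last lines into a single line.
        self.lines[start_row:end_row + 1] = [first[:start_col] + last[end_col:]]
        return removed


doc = LineDocument("hello\nworld")
end = doc.insert(0, 5, ",\nbig")   # the line array grows by one entry
removed = doc.remove((0, 5), end)  # undo the insertion
```

Like the class described above, this model depends on nothing else: it only holds lines and edits them, leaving display and session state to other classes.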

As we had inferred before reading, Document was a class for manipulating strings. So which other class calls Document? We didn't have the answer to this question yet. Anyway, let's move on to the next class, EditSession!