
Introduction
============

PatchTools is a tool to evaluate patches for inclusion into source trees managed by
the GIT source code control system, notably the Linux kernel. The scenario in which
PatchTools is intended to be used is:

* You have a patch set that was developed on differences detected between an old
  source version "a" and a new source version "b".
  
* You have a different source version "c" to which you wish to apply these patches.

* You don't want to, or can't, switch to an older source version because doing so
  would make important features, bug fixes, etc. disappear.

* You may want to work with snapshot archives of GIT source trees instead of using GIT
  if you don't plan to upstream any changes you make, if you don't plan to make any
  changes, or if you don't want to use GIT for local source code management.

Using "git am" or quilt to apply the patches often will not work because:

* The lines specified in a patch are missing or elsewhere in the same file in tree "c".
* The "a" or "b" file does not exist in a tree where it is supposed to be present.
* The "a" or "b" file is present in a tree where it is supposed to be absent.
* A later patch in your set fails because it depends on an early patch that failed.

Often these problems are caused by the integration of other patches into the "c" source tree
that modified the same files as your patches, thus invalidating the line numbers in
your patches.

PatchTools is primarily designed for use with Linux patches, but can be used with any software system
that uses GIT for source code control.

PatchTools assumes you have appropriate permissions to access any of the files referenced
by PatchTools modules.

PatchTools has been tested on Linux Mint 16, using both Python 2.7.5 and 3.3.


Installation/Setup
==================

There are two modes of installation: into one of your Python distributions, or as a
standalone source tree.

Python Installation
-------------------

PatchTools may be installed into one of your Python distributions using pip or easy_install.
It may also be installed from source into a distribution by entering 'sudo python setup.py install'
or 'sudo python3 setup.py install' in a shell while you are located in the root source folder.

If you install to a Python2.x distribution, the Python modules will be placed, for example, in
'/usr/local/lib/python2.7/dist-packages/patchtools', and there will be a metadata file
'patchtools-1.0-egg-info' in the same folder to describe the package. The doc and examples
data files will be placed in a '/usr/local/patchtools' folder.

If you install to a Python3.x distribution, the Python modules will be placed, for example,
in '/usr/local/lib/python3.3/site-packages/patchtools', and there will be a metadata file
'patchtools-1.0-egg-info' in the same folder to describe the package. The doc and
examples data files will be placed in the same '/usr/local/patchtools' folder.

Standalone Source
-----------------

To use PatchTools as a standalone source tree, download the tarball and extract its files to a suitable
location. Then you must create a PYTHONPATH environment variable as described in the "System Considerations"
section below.

The software may also be imported into an Eclipse-PyDev project using the 'File:Import' menu and the
'File System' option. Ensure that the Eclipse value of PYTHONPATH is set to the parent folder of the
source tree. Note that if you select a Python3 interpreter, the Command module, which is used by the
Viewer, will not be able to launch a gedit that is not Python3 compatible. The causes for this are
unknown, but PyDev is suspected, since such requests work properly in a shell. 

Data Files
----------

After installing PatchTools, you must set up some test data. Due to the large number of files contained
in a Linux kernel, and in some patch sets, test data is not included in the release.
   
In a suitable location create a 'data' folder with these sub folders:
   
   archives
       To hold any patch archive files you download from kernel.org, etc.
   
   patchset
       A folder to contain a tree of patches. You may have more than one such folder,
       if they have different names.
   
Download any patch archive files to the archives folder.
   
Download any patch sets to the patchset folder(s).
   
Download any source trees (e.g. Linux kernels) to convenient locations. It is not
recommended to store a Linux kernel in an Eclipse workspace, due to the large number
of files a kernel contains.
  
Finally create suitable config.json files to describe and link to the data. See the config.json files
in the examples folders for typical contents.
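
As a rough sketch of the layout described above, the following script creates the 'data' folder
and writes a minimal config.json. The key names 'sourcedir', 'patchdir' and 'logdir' are the ones
used in the examples later in this document; the function name, the paths and the choice of keys
are assumptions made for illustration, so treat the config.json files in the examples folders as
the authoritative reference::

    import json
    import os

    def make_data_tree(base, sourcedir):
        """Create data/archives and data/patchset under base, then write config.json."""
        data = os.path.join(base, "data")
        for sub in ("archives", "patchset"):
            path = os.path.join(data, sub)
            if not os.path.isdir(path):
                os.makedirs(path)
        config = {
            "sourcedir": sourcedir,                      # root of the kernel source tree
            "patchdir": os.path.join(data, "patchset"),  # root of the patch tree
            "logdir": base,                              # assumed location for reports
        }
        with open(os.path.join(base, "config.json"), "w") as f:
            json.dump(config, f, indent=4)
        return config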
       
    
Modules
=======

PatchTools can speed up the process of determining the suitability of your patches
for a new kernel by:

* Using the *Walker* to enumerate a useful subset of files in your kernel tree.

* Using the *Finder* to identify files that have specific content.

* Using the *Checker* to determine the compatibility of the patch info to your target kernel.
  
* Using the *Watcher* to monitor patch archive files distributed with newer kernels.

* Using the *Viewer* to edit related files simultaneously.

* Using the *Helper* utilities to speed development of these procedures.

	
Major Modules
-------------

The finder module implements a class *Finder* to facilitate searching file trees for patterns of interest to you.

The checker module implements a class *Checker* to compare the contents of patch
files to the source files they reference. *Checker* objects can be used to determine if
the changes in a patch are compatible with your target kernel.

The viewer module implements a class *Viewer* to facilitate editing patches and the files they
reference. Methods are implemented to handle special cases such as displaying a patch and
the files referenced by the patch.

The walker module implements a class *Walker* to allow enumeration of selected directories and files
in a file tree. Both directories and specific file types may be included in or excluded from the results.

The watcher module implements a class *Watcher* to compare the contents of patch
files to a "patch archive" file associated with a kernel release. You can use the *Watcher*
to determine if any of your patches have been integrated into newer kernels than the one on
which the patches were developed.

The helper module implements a class *Helper* to ease the task of assembling the fairly
numerous required and optional parameters of the modules listed above. It provides wrapper
functions for the modules, and some useful utility functions.

Supporting Modules
------------------

The archive module implements a class *Archive* to extract information from "patch archive" files
associated with kernel releases.

The command module implements a class *Command* to provide a simple wrapper for the Python subprocess module.

The functions module implements a class *Functions* to provide various utility functions.

The jsonconfig module implements a class *JSONConfig* to allow application configuration
using enhanced JSON data files.

The strings module implements a class *Strings* to provide useful string like methods for
lists of strings.


Archive
-------

Archive objects are used by the *Watcher* to extract information from "patch archive" files
provided by kernel.org for their kernel releases. The files contain a list of the patch diff
sections that were applied to obtain the release version. See Appendix B for more information.


Checker
-------

*Checker* objects compare the contents of patch files to the source files they reference.
Checker objects are used to determine if the changes in a patch are compatible with your target kernel.

The *Checker* class accomplishes its goals by the following steps:

* The patch file is read and parsed into a Patch object.

* The path specifications in the 'diff' sections are verified to match, and the files
  they reference to exist or not exist in the source tree as appropriate.

* The line number specifications in the 'hunks' are verified to fall within
  the numbers of actual lines in the files they reference.

* The edit changes are tested against the "c" version of the file.

The *Checker* has two primary modes of operation:

* In 'full' mode, all errors are reported, but lines that passed testing are not.

* In 'complete' mode, a status is reported for each hunk edit line.

The *Checker* has several optional features:

* If the 'find' option is True, the code will attempt to find missing lines in the
  target file. If no matches are found in a hunk, the program will attempt to find
  instances of the hunk's 'note' in the target file. But the search is limited to
  'landmark' lines, i.e. those lines that are expected to be unique in the file.
  Due to the complexity of typical C code, irrelevant matches may be reported
  even for complicated expressions.

* The 'targets' option can be used to limit checking to diffs that reference specific files.

See the 'Checker' section of the API documentation for call details.
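
The line number verification step listed above can be illustrated with a small standalone
sketch. It parses a unified diff hunk range line and tests whether the "old" range fits in a
file of a given length. The function name and the regular expression are assumptions made for
this illustration and are not taken from the *Checker* source::

    import re

    # "@@ -start,length +start,length @@"; a missing length means 1
    HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")

    def hunk_in_range(header, old_line_count):
        """Return True if the hunk's old range fits in a file of old_line_count lines."""
        m = HUNK_RE.match(header)
        if m is None:
            raise ValueError("not a hunk range line: %r" % header)
        start = int(m.group(1))
        length = int(m.group(2)) if m.group(2) is not None else 1
        if length == 0:
            # Pure insertion: 'start' is the line after which text is added.
            return 0 <= start <= old_line_count
        return start >= 1 and start + length - 1 <= old_line_count

A hunk that fails this test cannot apply at the stated position, although its lines may
still be present elsewhere in the file.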
  
Additional Notes
................

Since various mail systems and editors can corrupt the patch files as they are in transit,
the *Checker* normalizes the patch path lines ("--- ...", "+++ ..."), the 'diff' lines
("diff --git ...") and the hunk range lines ("@@ ...") before splitting them
for extraction. The normalization consists of replacing all tabs by spaces
and replacing multiple spaces between words of a line by single spaces.
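
A minimal sketch of this normalization, assuming it is applied to one line at a time
(the function name is hypothetical)::

    import re

    def normalize(line):
        """Replace tabs by spaces, then collapse runs of spaces into single spaces."""
        return re.sub(r" {2,}", " ", line.replace("\t", " "))

Such a function would be applied only to the path, 'diff' and hunk range lines, since
white space in the hunk body lines is significant.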

The *Patch* parsing logic will discard any patch lines that have been commented out by surrounding them
with lines containing only '"""'. See the 'Patch' section of the API documentation for
more details. This feature can be used to narrow the focus of your investigation to a small
set of patches and files. But if you plan to use 'git am' on a patch later, you should do
the commenting in a copy of the patch.

Usage
.....

To start a test against your target kernel, you may execute::

    f = h.load("dts_patch_refs.lst")
    m = h.check(f, h.extend(c['defaults'], { "mode" : "full", "find" : False, }))
    h.write(m, "match_dts.txt")
    
Assuming that you have used the *Finder* to locate patches that refer to some DTS files,
and saved the results into "dts_patch_refs.lst", this code will check those patches against
your kernel version.

See the 'Checker' section of the API documentation for further info on its parameters
and methods.

See 'Appendix E - Checker Output' for explanation of *Checker* output.
 

Recommended Usage Strategy
..........................

If the *Checker* is used on all patches in a large set, it can provide you with a very large amount
of bad news concerning the state of your patches, in part because it does not take into account
dependencies between patches. It is useful to narrow the scope of your investigations to a subsystem,
group of patches or group of files to analyze. If you decide to fix a series of related patches,
you should fix the first one in commit order, test the others again to see if any problems have been
resolved, and repeat this process down to the last patch.

The *Walker* and *Finder* can be used to generate small lists of files related to specific subsystems, based
on matches to text strings such as "am33xx", etc::

    l = h.load("patch_names.lst")
    m = { "substr" : ["am335x-bone", "am33xx.dtsi", "tps65217.dtsi"] }
    f = h.find(m, { "root_path" : c['patchdir'], "file_paths" : l, "format" : "files" })
    h.save(f, "dts_patch_refs.lst")

The *Checker* can be used on one or a few of these files at a time::

    f = h.load("dts_patch_refs.lst")
    g = h.PatchSet(c['defaults']).sort_patches({ "patches" : f, "order" : "patchset" })
    d = h.check(g, h.extend(c['defaults'], { "find" : True }))

Note that the *Walker* and *Finder* do not return lists of patch names in patchset order, which is supposed
to be the commit order needed for successful use of 'git am'.

If the *Checker* 'targets' option is used, the *Checker* will only scan diff sections that modify the files
specified in the targets list. For example, you could select BeagleBoard related device tree files for investigation::

    f = h.load("dts_patch_refs.lst")
    g = h.PatchSet(c['defaults']).sort_patches({ "patches" : f, "order" : "patchset" })
    t = ["am335x-bone.dts", "am335x-boneblack.dts", "am335x-bone-common.dtsi",
         "am33xx.dtsi", "am33xx-clocks.dtsi", "tps65217.dtsi", "Makefile"]
    d = h.check(g, h.extend(c['defaults'], { "find" : True, "targets" : t }))
    h.write(d, "dts_matches.txt")

When the *Checker* encounters a diff file name that is not in its targets list, it will issue a message like::

    # SKIPPING DIFF: "diff --git a/arch/arm/boot/dts/am335x-bone.dts ...

The *Patch* module used by the *Checker* to parse patches will discard any patch lines that are surrounded
by lines containing only '"""'. If you observe that an initial *Checker* report indicates that all the errors
in a hunk are like '"delete" line not found' or '"merge" line not found' and that the target file does not have
the specified "delete" lines, or that "add" lines are already in the target file, you can comment out the diff
or hunk to further narrow the scope of your investigation to hunks or diffs that actually need to be fixed.
This strategy was used heavily in Example 5 described below.

If all diffs in a patch are commented out, the *Checker* will issue a message like::

    INFO:  skipping empty patch

Some example uses of this feature are:

    To comment out a diff for a file in which you are not interested::

        """
        N/U not BeagleBone Black files
        diff --git a/arch/arm/boot/dts/am335x-bone.dts b/arch/arm/boot/dts/am335x-bone.dts
        ...
        """

    To comment out a diff or a hunk which is not needed::

        """
        N/N am33xx.dtsi has the add lines and more values
        diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi
        ...
        """

The use of 'N/N' and other codes to start explanatory notes is conventional, and is not interpreted by the software.

The 'vcpf' (view checker output, patch and its files) function of the *Helper* can be very helpful in determining
if a patch is correct or if it is needed.

See the files in the "examples" folder for complete usage examples.


Command
-------

Command objects provide a simple wrapper for the Python subprocess module.
Command execution can be synchronous, or asynchronous. Normally, synchronous mode
will be used by the *Helper*, except that most of its view commands use asynchronous mode.

Commands may be passed as strings or as lists of arguments.

Command strings are executed in shell mode::

    cd ~;pwd;ls

There appear to be some limitations to this approach, as in a command string like::
        
    cd ~; source .profile;cdlsk
    
where 'cdlsk' is an alias defined in .profile:

* The shell will not be able to find .profile unless you reference it as './.profile'
  or '~/.profile'.

* The 'cdlsk' alias will not be recognized.

* You cannot split the command into multiple invocations because the environment created
  by 'source ./.profile' will be lost when the sub process exits.

* Commands that normally produce output organized in columns in an interactive shell,
  e.g. 'ls', will instead produce a list of items separated by '\\n' characters.

The usage of stdout and stderr by Linux programs is variable, for example:

   * bash may return zero, indicating success, but also return error output.
   * wget returns normal output on stderr.

Some programs use non-ASCII characters (i.e. ord(ch) >= 0x80 in Python terms)
in their output. For example, certain commands surround filenames with left single
quote and right single quote characters. Even though Eclipse and PyDev can display such
characters, Python3 will show them as hexadecimal escape sequences (e.g. lsquo is represented by its UTF-8 bytes '\\xe2\\x80\\x98').
The *Command* class converts lsquo and rsquo to ' as needed.

Python2.7 returns stdout and stderr data as strings, while Python3.x returns them as bytes objects.
Command translates the values to strings as needed.
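
The behavior described in this section can be sketched with the standard subprocess module.
This is an illustration written for Python 3, not the *Command* implementation; the function
name and the choice of UTF-8 decoding are assumptions::

    import subprocess
    import sys

    def run(cmd):
        """Run cmd synchronously; strings go through the shell, lists do not."""
        proc = subprocess.Popen(cmd, shell=isinstance(cmd, str),
                                stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        out, err = proc.communicate()
        # Python 3 delivers bytes; decode so that callers always receive text.
        out = out.decode("utf-8", "replace")
        err = err.decode("utf-8", "replace")
        return proc.returncode, out, err

    code, out, err = run([sys.executable, "-c", "print('hello')"])

Because programs differ in how they use stdout and stderr, callers should examine both
streams as well as the return code.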

See the 'Command' section of the API for call details.


Finder
------

*Finder* objects try to find references to patterns you specify in a file or a file tree.
For each pattern you select, the *Finder* will return a list of references it finds in the target file or folder.

Unlike programs such as ctags and cscope, the *Finder* does not attempt to index your entire software tree,
but instead focuses on the folders, files and text patterns you specify.

Either absolute or relative paths may be used in specifying the search root.
When relative paths are used, they must be accessible from the caller's current directory.
    
Using common patterns such as 'dma' may produce a large amount of output,
particularly if you set the search root to the root of a kernel tree, so your choices of
root and patterns should be made with care.
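
The idea of searching only the files and patterns you specify can be sketched as follows.
The function name and the form of the result (a dict mapping each pattern to a list of
path, line number and line text tuples) are assumptions made for this illustration; the
actual *Finder* output formats are described in the API documentation::

    import io

    def find_refs(paths, substrs, encoding="latin_1"):
        """Return {pattern: [(path, line_number, line), ...]} for the given files."""
        refs = dict((s, []) for s in substrs)
        for path in paths:
            with io.open(path, encoding=encoding) as f:
                for number, line in enumerate(f, 1):
                    for s in substrs:
                        if s in line:
                            refs[s].append((path, number, line.rstrip("\n")))
        return refs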

See the 'Finder' section of the API documentation for more information on the class methods.


JSONConfig
----------

A *JSONConfig* object holds application configuration data taken from enhanced JSON encoded files.

The JSON files may contain line comments and block comments.

Line comments are lines that start with '#' (ignoring leading white space),
and will be removed in file loading.

Block comments are coded by inserting a line with only """ before and after the lines to be commented out.
Any lines between such lines are removed in file loading.

Once loaded, the object may be accessed by key indexing of its dict super class instance, or by use of
the get and set methods. These two methods support convenient multilevel indexing by the use of
"path expressions". For example, '...get("mysql_options/admin_profile/data_base")' will
return the value of self["mysql_options"]["admin_profile"]["data_base"]. The user can specify the path
separator character in the 'separator' option passed to the constructor.
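
The path expression lookup can be sketched with a plain function operating on nested dicts.
The function name, the 'default' argument and the handling of missing keys are assumptions;
see the API documentation for the behavior of the real get method::

    def path_get(data, path, separator="/", default=None):
        """Follow a path expression such as "a/b/c" through nested dicts."""
        node = data
        for key in path.split(separator):
            if not isinstance(node, dict) or key not in node:
                return default
            node = node[key]
        return node

    config = { "mysql_options" : { "admin_profile" : { "data_base" : "test" } } }
    name = path_get(config, "mysql_options/admin_profile/data_base")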

Python's JSON loader strictly requires the data to have correct JSON syntax, and will
generate exceptions if it doesn't. To avoid confusing users by presenting them with
line numbers in the stripped line set, JSONConfig will catch exceptions raised by the JSON decoder,
map the exception line numbers back to their equivalents in the original file,
and reraise the exceptions as JSONConfigError objects.
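
The comment stripping and line number mapping can be sketched as follows. This is an
illustration under stated assumptions, not the JSONConfig code: the function name is
hypothetical, ConfigSyntaxError stands in for JSONConfigError, and the line number is
recovered from the decoder's message text::

    import json
    import re

    class ConfigSyntaxError(Exception):
        """Hypothetical stand-in for JSONConfigError."""

    def load_enhanced_json(text):
        kept, origin = [], []    # surviving lines and their original line numbers
        in_block = False
        for number, line in enumerate(text.splitlines(), 1):
            stripped = line.strip()
            if stripped == '"""':
                in_block = not in_block    # block comment marker; drop it
            elif in_block or stripped.startswith("#"):
                pass                       # commented out line
            else:
                kept.append(line)
                origin.append(number)
        try:
            return json.loads("\n".join(kept))
        except ValueError as exc:
            # The decoder reports "line N column M" in terms of the stripped text.
            m = re.search(r"line (\d+)", str(exc))
            where = origin[int(m.group(1)) - 1] if m and origin else None
            raise ConfigSyntaxError("JSON error at original line %s: %s" % (where, exc))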

On Python 2.x, the get method will translate unicode dict values to str objects.

See the 'JSONConfig' section of the API documentation for call details.

See the test*.py programs in the examples folder, and the '__init__' method of the *Helper* for typical usage.


Matcher
-------

*Matcher* objects match strings to patterns. Used by the *Walker* to filter file names
and the *Finder* to filter text strings, the *Matcher* allows you to specify match patterns by:

    match
        a list of exact match patterns
    prefix
        a list of prefix patterns
    suffix
        a list of suffix patterns
    substr
        a list of substring patterns
    regexp
        a list of regular expression patterns
    funcs
        a list of callback functions
        
Patterns are tested against strings in the order shown above.
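
The evaluation order can be sketched with a plain function that takes the same kind of
parameter dict. The function name is hypothetical, and the use of re.match (anchored at the
start of the string) for 'regexp' patterns is an assumption::

    import re

    def matches(text, params):
        """Return True if text satisfies any pattern in params, tested in order."""
        if text in params.get("match", []):
            return True
        if any(text.startswith(p) for p in params.get("prefix", [])):
            return True
        if any(text.endswith(s) for s in params.get("suffix", [])):
            return True
        if any(s in text for s in params.get("substr", [])):
            return True
        if any(re.match(r, text) for r in params.get("regexp", [])):
            return True
        return any(func(text) for func in params.get("funcs", []))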

Some examples::

    f1 = Matcher({ "prefix" : ["Kconfig", "Makefile"] })
    f2 = Matcher({ "suffix" : [".dts",".dtsi"] })   
    f3 = Matcher({ "substr" : ["am33xx.dtsi", "am335x-bone.dts", "am335x-boneblack.dts",
          "am335x-bone-common.dtsi", "am33xx_pwm-00A0.dts", "bone_pwm_P8_13-00A0.dts" ] })
    f4 = Matcher({ "regexp" : [r".*am335x\-b.*\.dts.*", r".*am33xx.*\.dts.*", r".*bone.*\.dts.*"] })

Parameters like f1 can be used by the *Walker* to find all files whose names begin with "Kconfig" or "Makefile"
in the folders you told it to search.

Parameters like f2 can be used by the *Walker* to find all files whose names end with ".dts" or ".dtsi".

Parameters like f3 or f4 can be used by the *Finder* to find all references to the specified strings in your patch
or source files.

Using regular expressions may eliminate the need to use line continuations, but it can be difficult to formulate
expressions that produce exactly the same result as simpler combinations of 'substr', etc.

The 'encoding' example shows how to use callback functions to select files for testing.

See the 'Matcher' section of the API documentation for call details.


Patch
-----

Patch objects parse the strings of a patch file into an object suitable for analysis.
The object will contain a list of Diff objects.

Patch objects will discard any patch lines that have been commented out by surrounding them
with lines containing only '"""'. If you plan to use 'git am' on a patch later, you should
do the commenting in a copy of the patch.

See the 'Patch' section of the API documentation for call details.

See checker.py for example usage of the Patch class.


PatchSet
--------

Patch sets are normally organized in two level trees, with a root folder and sub directories
for specific topics, e.g. 'dma'. The 'name' of a patch consists of its subdirectory name
joined to its file name by '/'.

A patch set description is a dict that specifies the order in which the patches are to be applied.
Its 'groups' element is a list of the topic specific sub folders mentioned above. Within each
sub folder, patches are intended to be applied in the order indicated by the first four characters
of the patch file names. This order was encoded by using 'git format-patch' or quilt to format
the patches.
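
The ordering rule can be sketched as a sort on the position of a patch's group in the
'groups' list, followed by the first four characters of its file name. The function name
and the sample patch names are hypothetical::

    def order_patches(names, groups):
        """Sort 'group/NNNN-title.patch' names by group position, then by NNNN."""
        rank = dict((group, i) for i, group in enumerate(groups))
        def key(name):
            group, filename = name.split("/", 1)
            return (rank[group], filename[:4])
        return sorted(names, key=key)

    names = ["dma/0002-b.patch", "adc/0001-x.patch", "dma/0001-a.patch"]
    ordered = order_patches(names, ["dma", "adc"])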


Strings
-------

Strings objects provide useful string like methods for lists of strings.

Note that taking a slice of a Strings object will always return a Strings object.

Strings is used by the Watcher class, as well as by the Diff, Hunk and Patch classes.
See those files and the examples files for more usage examples.
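
The slicing behavior can be sketched with a small list subclass. This is an illustration for
Python 3 only (Python 2 routes simple slices of list subclasses through a different method),
and the class name is hypothetical. The chainable sort and unique methods mirror the calls
used in Example 4::

    class StringList(list):
        """Hypothetical list of strings whose slices keep the same type."""

        def __getitem__(self, index):
            result = list.__getitem__(self, index)
            return StringList(result) if isinstance(index, slice) else result

        def sort(self):
            # Return a new sorted object so that calls can be chained.
            return StringList(sorted(self))

        def unique(self):
            seen, result = set(), StringList()
            for item in self:
                if item not in seen:
                    seen.add(item)
                    result.append(item)
            return result

    s = StringList(["b", "a", "b"])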


Viewer
------

Viewer objects allow you to view sets of related files, using graphical or nongraphical editors.

The default editor on Linux is 'gedit', which allows numerous files to be displayed in a single window.

The editor default can be overridden by passing an editor specification to the constructor,
as shown in the API section below.


Walker
------

Walker objects walk a file tree and return the path of each discovered file.
Directory and file filters may be applied to narrow the scope of a search in a large file tree.
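
A walk with directory and file filters can be sketched with os.walk. The function name and
its parameters are assumptions made for this illustration and are simpler than the 'incl_dirs',
'excl_dirs' and 'incl_files' options shown in the examples::

    import os

    def walk_files(root, excl_dirs=(), suffixes=()):
        """Return sorted '/' separated paths, relative to root, of the selected files."""
        found = []
        for dirpath, dirnames, filenames in os.walk(root):
            rel = os.path.relpath(dirpath, root)

            def relname(name):
                return os.path.normpath(os.path.join(rel, name)).replace(os.sep, "/")

            # Prune excluded directories in place so that os.walk never enters them.
            dirnames[:] = [d for d in dirnames if relname(d) not in excl_dirs]
            for name in filenames:
                if not suffixes or name.endswith(tuple(suffixes)):
                    found.append(relname(name))
        return sorted(found)

Pruning a directory such as '.git' before it is entered is what makes a walk of a kernel
tree affordable.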


Watcher
-------

Watcher objects facilitate viewing diff sections in your patch files, diff sections in "patch archive"
files, and the source files they reference. Patch archive files contain all the diff sections that
were applied to the previous version of a kernel to obtain a new version.

See the 'archives' folder in the examples for typical usage.

Helper
------

Helper objects facilitate use of the PatchTools modules, which have many required and
optional parameters. The *Helper* provides wrapper functions for PatchTools classes,
and some useful utility functions.

Command Summary
...............

Utility functions
~~~~~~~~~~~~~~~~~

load
    load JSON data into a variable from a file
save
    save a variable to a file as JSON string
read
    read list of strings from a file into a variable
write
    write a variable to a file as a list of strings

Wrapper functions
~~~~~~~~~~~~~~~~~

cmd
    run *Command* to execute shell command synchronously
find
    run *Finder* to find patterns in files
check
    run *Checker* to check patch file(s) against source tree
view
    display selected file[s]
vp2f
    display patch and files it uses
vf2p
    display file and patches that use it
vp2p
    display patch and other patches that use the same files
vp2a
    display patch and related patch archive diff sections
vcpf
    display *Checker* output file, one patch file and the files referenced by the patch file
walk
    run *Walker* to generate a list of files for *Finder*, etc.
watch
    run *Watcher* to detect integration of patches into released kernels

The *Helper* constructor creates a 'defaults' item in the application's config data object,
using the values of 'sourcedir', 'patchdir', etc., found in the data, and subsequently
uses it in calls to wrapper functions where the caller does not provide a parameters object.

See the 'Helper' section of the API documentation for further info.


Configuration
=============

Many operations use configuration data that is loaded into a JSONConfig object during
initialization. Items specified in the configuration data include the location of the
source tree and of the patches directory. A description of the patch set may also be
stored there.

See the 'config.json' file for an example, and the 'JSONConfig' section of the
API documentation.


Exceptions
==========

PatchTools objects are intended to be embedded in Python scripts, which can have various
exception reporting and logging schemes. Consequently PatchTools classes generally do not
catch exceptions except to translate them to other exceptions. For example, JSONConfig objects
catch the 'ValueError' exceptions that Python's json module raises for malformed data, and map their
line numbers to those of the original file, which differ from the stripped text because of comment lines.

The exceptions.py file provides an ExceptionHandler class which can be used to print
exception reports.

PatchTools classes uniformly use exceptions to report errors, for example parameter errors,
while return values are used to deliver valid result data to the caller. The exceptions are defined
in exception.py.

See the example files for a simple exception handling scheme.


System Considerations
=====================

All modules encode file paths in Unix style using '/' characters.

The functions module determines if the Python version is 3.x when it is loaded.
The Python version can then be obtained by other modules by calling the Functions.is_python3 method.

If PatchTools is not installed in your Python installation, and you are not using Eclipse,
you must specify the PYTHONPATH environment variable to allow Python to find the PatchTools modules.
The easiest way to do this is to export the definition from your .profile file::

   export PYTHONPATH="$HOME/Projects/Eclipse/Linux/PatchTools"

Then you can run test programs from any location.

Eclipse-PyDev will define a PYTHONPATH variously according to the settings you choose when creating
your project. The PYTHONPATH value should include the parent folder of the 'patchtools' source
folder, as in the setting above.

Linux reportedly has adopted UTF-8 as the default text encoding, but some Linux kernel files contain
'Latin_1' characters whose numeric values are greater than 127, and are not valid UTF-8 start bytes.
Consequently the file access methods in the Functions class default to 'Latin_1' encoding.
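
The difference between the two encodings can be seen in a short sketch. The byte string is
an invented sample, and the function name is hypothetical::

    def decode_text(raw):
        """Return (decodes_as_utf8, text decoded as Latin-1)."""
        try:
            raw.decode("utf-8")
            utf8_ok = True
        except UnicodeDecodeError:
            utf8_ok = False
        # Latin-1 maps every byte value to a character, so this cannot fail.
        return utf8_ok, raw.decode("latin_1")

    raw = b"Copyright \xa9 Fran\xe7ois"
    utf8_ok, text = decode_text(raw)

Decoding as 'Latin_1' and encoding again with the same codec reproduces the original bytes
exactly, so files read this way can be written back unchanged.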

Examples
========

This section shows typical usage of PatchTools classes and the *Helper* class.

See the folders in the 'examples' tree for the example code referenced below.

Example 1 -- Basic Features
---------------------------

This example shows the usage of many of the features described above.

Suppose we want to create a source file list for the *Finder* that enumerates selected files
in the kernel tree, but excludes the '.git' folder and any folders in 'arch' other than
'arch/arm'. This script will do the job::

    f = { "suffix" : [".c",".h"], "prefix": ["Makefile","Kconfig"] }
    p = { "root_path" : c['sourcedir'], "incl_files" : f }
    excl = { "excl_dirs" : [".git","arch"] }
    incl = { "incl_dirs" : ["arch/arm"] }
    p1 = h.extend(p, incl)
    p2 = h.extend(p, excl)
    w = h.walk(p1) + h.walk(p2)
    h.save(w, c['logdir'] + "/src_files.txt")
    
Lines 1-4 create some dict variables to include in the parameters for the walk operations.

Lines 5 and 6 combine these dicts to make the final parameter dicts.

Line 7 executes the walk function on each parameter dict and combines their output.

Line 8 writes the data to a file for future use.

Running a script like this at the start of a project can substantially reduce the time required
for subsequent *Finder* operations.

Example 2 -- Viewer 1
---------------------

In this example we want to assess the difficulty of porting some patch changes to our kernel by displaying
a *Checker* output file, a kernel source file, and the patches that modify the source file::

    h.view(c['logdir'] + "/match_a.log")
    h.vf2p("arch/arm/boot/dts/am33xx.dtsi")
    
If the 'gedit' editor is used (the Linux default editor), all the files will be displayed in a single window,
in the following order for this example:

    * The source file
    * The patch files in patchset group order, which is presumably their commit order.
  
We observe that searching for 'am33xx.dtsi' at the top of the *Checker* output file takes us to the report
for patch "dma/0018...", which is the first patch file displayed by gedit.

The Viewer classes normally launch the editor in asynchronous (nowait) mode, which allows users to enter two commands
like the ones above without being blocked at the first command.

Example 3 -- Viewer 2
---------------------

This example shows how the wait/nowait feature of the Viewer classes could be used to display a patch file,
its corresponding *Checker* output file, and the files referenced by the patch, one at a time::

    p = c['patchdir'] + "/adc/0002-input-ti_am33x_tsc-Step-enable-bits-made-configurabl.patch"
    m = h.check(p)
    f = "matcher.txt"
    h.write(m, f)
    v = { "wait" : True }
    # The name 'c' holds the configuration data, so the source paths use s1..s4.
    s = c['sourcedir']
    s1 = s + "/drivers/iio/adc/ti_am335x_adc.c"
    s2 = s + "/drivers/input/touchscreen/ti_am335x_tsc.c"
    s3 = s + "/drivers/mfd/ti_am335x_tscadc.c"
    s4 = s + "/include/linux/mfd/ti_am335x_tscadc.h"
    h.view([p,f,s1],v)
    h.view([p,f,s2],v)
    h.view([p,f,s3],v)
    h.view([p,f,s4],v)

In this scenario, each view command will block until its editor is closed.

Example 4 -- AM33XX Drivers
---------------------------

In this example we will identify all the drivers used to control the Beagle's AM335X processor by cross referencing
"compatible =" items in the Beagle's .dts files to ".compatible =" items in the "of_match" tables of the kernel source files.

First we will find the dts compatible items::

    d = "arch/arm/boot/dts/"
    l = [(d + s) for s in ["am33xx.dtsi","am335x-bone-common.dtsi", "am335x-boneblack.dts",
         "am33xx-clocks.dtsi", "tps65217.dtsi"]]
    f = h.find({ "substr" : ["compatible = "] }, { "root_path" : c['sourcedir'], "file_paths" : l })
    h.write(f, "test3.txt")

test3.txt requires some post processing, as it contains extraneous text and duplicate entries.
The target files may have a tab instead of a space between the '=' and the following '"' character.
The *Helper's* write method was used to save the list as '\\n' terminated strings, so this task can be
automated::

    strings = Strings(h.read("test3.txt"))
    for index1 in range(len(strings)):
        string = strings[index1]
        index2 = string.find("compatible =")
        if index2 != -1:
            # keep only the text after "compatible =", without padding or ';'
            string = string[index2 + len("compatible ="):]
            string = string.lstrip(' \t').rstrip(';')
            strings[index1] = string
    strings = strings.sort().unique()
    h.write(strings, "test4.txt")

Next we will list the kernel source files that might contain the corresponding entries in their of_match tables::

    p = { "root_path" : c['sourcedir'], "incl_files" : { "suffix" : [".c"] } }
    p1 = h.extend(p, { "incl_dirs" : ["arch/arm"] })
    p2 = h.extend(p, { "excl_dirs" : ["arch", ".git", "Documentation", "staging", "samples", "tools" ] })
    w1 = h.walk(p1)
    w2 = h.walk(p2)
    h.save(w1+w2, "test5.txt")

In this code the output was saved as a JSON object so that it can be loaded as the
'file_paths' parameter of the following find operation.

Finally we will match the patterns in "test4.txt" to the source files::

    p = h.read("test4.txt") # read compatible = items
    f = h.load("test5.txt") # load candidate sources
    r = c['sourcedir']
    m = h.find({ "substr" : p },{ "root_path" : r ,"file_paths" : f })
    h.write(m, "test6.txt")

Again some editing of the output file "test6.txt" is needed to eliminate extraneous
text and duplicates::

    strings = h.read("test6.txt")
    strings = [string[:string.find(':')] for string in strings]
    strings = Strings(strings).sort().unique()
    h.write(strings, "am335x_drivers.lst")

The final list contains the names of 69 drivers. Of these, 16 match dts terms we sought but belong
to other devices, while 53 drivers are used to control the Beagle board.
The list can be saved and passed to the *Checker* as a 'targets' option, or to the *Watcher*
to be run whenever a new kernel version is released by kernel.org.
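
The cross reference performed in this example can also be pictured as a small in-memory operation.
The sketch below is an illustration only: 'cross_reference' is a hypothetical function, the sample
data is made up for the demonstration, and the search terms are assumed to keep their surrounding
quotes so that a short term cannot match the prefix of a longer one::

    def cross_reference(dts_terms, source_lines):
        """Map each driver file to the sorted dts terms its of_match table mentions.

        source_lines maps a file name to the list of lines in that file."""
        drivers = {}
        for path, lines in source_lines.items():
            hits = {term for term in dts_terms
                    for line in lines
                    if ".compatible" in line and term in line}
            if hits:
                drivers[path] = sorted(hits)
        return drivers

    # Illustrative sample data, not taken from an actual run.
    dts_terms = ['"ti,omap4-gpio"', '"ti,am3352-uart"']
    source_lines = {
        "drivers/gpio/gpio-omap.c": ['\t\t.compatible = "ti,omap4-gpio",'],
        "drivers/tty/serial/omap-serial.c": ['\t{ .compatible = "ti,am3352-uart" },'],
        "drivers/net/ethernet/ti/cpsw.c": ['\t{ .compatible = "ti,cpsw", },'],
    }
    result = cross_reference(dts_terms, source_lines)
    for path in sorted(result):
        print(path, result[path])

Only files with at least one matching term appear in the result, which is why a manual review is
still needed to separate drivers for the target board from drivers that merely share a term.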

Example 5 -- DTS Study
----------------------

The ARM community, including Texas Instruments and the Beagle developers, has made substantial
progress in the last year or two in adopting the "Device Tree" system for its products. It is possible that,
although they have submitted many patches related to this effort, not all the patches have been integrated
into our target kernel. In this example we will investigate the state of Beagle-related .dts and .dtsi files
in the patches and the kernel.

Examination of the kernel's '/arch/arm/boot/dts' folder shows that there are several files related to our
target device, the BeagleBone Black:

    * am335x-boneblack.dts
    * am335x-bone-common.dtsi
    * am33xx.dtsi
    * am33xx-clocks.dtsi
    * tps65217.dtsi

We can find all the patches that touch these files with this *Helper* script::

    l = h.list_patches()
    p = { "substr" : ["am33xx.dtsi", "am335x-boneblack.dts", "am335x-bone-common.dtsi",
                      "tps65217.dtsi", "am33xx-clocks.dtsi"] }
    f = h.find(p, { "root_path" : c['patchdir'], "file_paths" : l, "format" : "files" })
    g = h.PatchSet(c['defaults']).sort_patches({ "patches" : f, "order" : "patchset" })
    h.save(g, "dts_patch_refs.lst")

The first and sixth lines show use of the *Helper's* utility functions, 'list_patches' and 'save'.

The fourth line shows use of one of the *Helper's* class wrapper functions, 'find'. The "files"
option tells the *Finder* to return only filenames, with no duplicates.

See the API sections for descriptions of the functions, classes and parameters used.
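
As a rough illustration of what "patches that touch these files" means, the following stand-alone
sketch reads the 'diff --git' headers of each patch. It is not how the *Finder* works (the *Finder*
searches the patch text for substrings); the function names and the sample patches are hypothetical,
and paths containing spaces are not handled::

    def touched_files(patch_text):
        """Return the files named in the 'diff --git' headers of a patch."""
        files = []
        for line in patch_text.splitlines():
            if line.startswith("diff --git "):
                # Header form: diff --git a/<path> b/<path>
                path = line.split()[-1]
                if path.startswith("b/"):
                    path = path[2:]
                if path not in files:
                    files.append(path)
        return files

    def patches_touching(patches, targets):
        """Return the sorted names of patches that touch any file whose
        path ends with one of the target names."""
        hits = set()
        for name, text in patches.items():
            paths = touched_files(text)
            if any(p.endswith(t) for p in paths for t in targets):
                hits.add(name)
        return sorted(hits)

    # Hypothetical patch contents, reduced to their diff headers.
    patches = {
        "0001-add-node.patch":
            "diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi\n",
        "0002-fix-gpio.patch":
            "diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c\n",
    }
    print(patches_touching(patches, ["am33xx.dtsi"]))
    # ['0001-add-node.patch']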

Now we have the file "dts_patch_refs.lst", which contains the names of 76 patches that touch the files.
From here we can use the *Helper*'s view commands to view the patches individually, and the files they use.
For example::

    h.vp2f("pm/0062-ARM-OMAP2-AM33XX-timer-Interchance-clkevt-and-clksrc.patch")

The files can also be compared against a *Checker* output file by using the 'vcpf' command::

    h.vcpf("check.log", "pm/0062-ARM-OMAP2-AM33XX-timer-Interchance-clkevt-and-clksrc.patch")
   
If the default editor 'gedit' is used, all the files for a single command will appear in one window.

Inspecting the files in this way could take considerable time, but this process can be accelerated
by using a *Helper* script like this::

   fl = h.load("dts_patch_refs.lst")
   cp = h.extend(c['defaults'], { "mode" : "complete", "find" : True })
   ep = { "wait" : True }
   pd = c['patchdir'] + '/'
   for patch in fl:
       l = h.check(patch, cp)
       h.write(l, "test6.txt")
       h.vcpf("test6.txt", pd + patch, c['defaults'], ep)

After checking the 76 patches we find that:

    * 27 of the patches specify changes that are already in the kernel, so the patches can be ignored.
    * Another 28 of the patches relate to features we won't use, i.e. certain capes, so these patches can also be ignored.
    * One patch definitely needs to be fixed.
    * About 20 patches need fixing if we will use their features, e.g. AM335 reset control or TI's PM firmware.
    * One patch has an undesirable modification of generic kernel files to work around a device specific problem,
      and should be redone.

See the 'report.txt' file in the examples/alldrivers folder for details.

Similar investigations can be done for subsystems you may be interested in, e.g. dma, adc, gpio, etc.,
with comparable results.


Example 6 -- TSC/ADC Study
--------------------------

In this example we look into the appropriateness of patches that modify the touch screen control (TSC)
and analog-digital converter (ADC) control logic in the kernel and driver files. We will use the *Helper*
to facilitate the investigation.

First we find patches that relate to terms such as "tscadc", "tsc" and "adc". We use search terms
such as "adc.c" rather than "adc" to exclude extraneous matches to words that merely contain those terms::

   l = h.list_patches()
   p = { "substr" : ["tscadc.c","tscadc.h","tsc.c","tsc.h","adc.c","adc.h"] }
   f = h.find(p, { "root_path" : c['patchdir'], "file_paths" : l, "format" : "full" })
   h.save(f, "patch_refs.lst")
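
The effect of choosing the longer search terms can be seen in a small stand-alone sketch. The
function 'matching_lines' and the sample patch lines are hypothetical and serve only to show why a
bare term such as "adc" matches too much::

    def matching_lines(lines, terms):
        """Return the lines that contain at least one of the terms."""
        return [line for line in lines if any(t in line for t in terms)]

    # Made-up patch lines: two file headers and two unrelated code lines.
    lines = [
        "+++ b/drivers/iio/adc/ti_am335x_adc.c",
        "+++ b/drivers/mfd/ti_am335x_tscadc.c",
        "+\t/* wait for the broadcast to finish */",
        "+\tstruct tscadc_platform_data *pdata;",
    ]

    loose = matching_lines(lines, ["tscadc", "tsc", "adc"])
    tight = matching_lines(lines, ["tscadc.c", "tscadc.h", "tsc.c", "tsc.h",
                                   "adc.c", "adc.h"])
    print(len(loose), len(tight))
    # 4 2

The bare terms match every sample line, including the comment that contains the word "broadcast",
while the file name terms match only the two file headers.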
   
Inspection of "patch_refs.lst" shows these files are touched by the patches:

   (1) drivers/iio/adc/ti_am335x_adc.c
   (2) drivers/input/touchscreen/ti_am335x_tsc.c
   (3) drivers/mfd/ti_am335x_tscadc.c
   (4) include/linux/mfd/ti_am335x_tscadc.h
   (5) include/linux/input/ti_am335x_tsc.h
   (6) include/linux/clk-provider.h

Next we make a list of the patches that touch each file, without duplicates::

   l = h.list_patches()
   t = ["drivers/iio/adc/ti_am335x_adc.c",
        "drivers/input/touchscreen/ti_am335x_tsc.c",
        "drivers/mfd/ti_am335x_tscadc.c",
        "include/linux/mfd/ti_am335x_tscadc.h",
        "include/linux/input/ti_am335x_tsc.h",
        "include/linux/clk-provider.h"]
   p = { "root_path" : c['patchdir'], "file_paths" : l, "format" : "files" }  
   f = h.find({ "substr" : t }, p)
   f = [s.split(':', 1)[0] for s in f] # Extract patch name (text before any ':')
   s = Strings(f).sort().unique()
   h.save(s, "patch_refs2.lst")
   
The "patch_refs2.lst" file now contains the names of 23 patches that touch the files. We can use
the *Checker* and *Viewer* to examine the patches and files::

    f = h.load("patch_refs2.lst")
    p = h.extend(c['defaults'], { "mode" : "complete", "find" : True })
    r = c['patchdir'] + '/'
    cf = "../test/test6.txt"
    for patch in f:
        m = h.check(patch, p)
        h.write(m, cf)
        h.vcpf(cf, r + patch, p, { "wait" : True })

The results have been summarized in the file "report.txt" in the examples/tsc_adc folder. See the Notes
section at the bottom of that file for an explanation of the codes used.

We see that although the majority of diff and hunk sections are marked "N/N" (not needed), a significant
minority are indicated to need fixing. Some of these only need to have their line numbers adjusted to fit
the target code, but others have more serious problems. In particular, certain features such as enhancing
interrupt logic and adding work queues have not been integrated into the target kernel drivers. In other cases,
the changes requested by some patches may have been obsoleted by other patches that were integrated into
the kernel.

The situation is more complicated than was the case in the DTS study, and more investigation is needed.
However, we can drop further consideration of all the patch sections marked "N/N" or "N/U", and focus on the
remaining sections. The best approach may be to use the existing versions of the target files,
and only fix and apply the patch sections that affect our project.

Example 7 -- Cape Manager Study
-------------------------------

The BeagleBone board developers have devised a system that allows accessory boards to be plugged into
a Beagle board using its GPIO connectors, and have supported the development of a kernel mode file
'capemgr.c' to provide some control of such boards. The patches that created and enhanced the file have
not been integrated into the target kernel, so the file is not found there. In this example, we
investigate the use of the 'git am' command and the patch utility to recreate the file from the patches.

See the examples/capemgr folder for the code, and Appendix D - Checker vs. quilt vs 'git am'
for a discussion of the results. The generated file is found under the examples/capemgr/quilt folder.


Classes API
===========

Complete documentation of the APIs of the *Helper* and PatchTools classes is provided below.

