I'm working on a research project using Eclipse CDT and I'd like your help with the following:
Context: Assume I have a library myLib.h with a set of classes, functions, methods, variables, etc., and a single (big) source file main.cpp that includes myLib.h and uses some of the elements from myLib.h.
Goal: I want to identify all the elements (classes, functions, methods, variables) from myLib.h that are used in main.cpp.
Status: If all the elements in the header file are enclosed in namespaces, I can get these namespaces by preprocessing the header. When I analyse the AST of main.cpp, I can get the scope of the AST elements (after binding resolution) and check whether they are included in the namespaces of interest (those in myLib.h).
Problem: However, this does not work if myLib.h has no namespaces (I know this is not a recommended practice, but I encountered this problem when I tried to analyse big open source projects). How can I resolve this issue?
Given an IBinding for a name in the cpp file, you can call IIndex.findDeclarations(IBinding) to get an array of IIndexName objects representing the declarations of the binding (the index can be obtained via IASTTranslationUnit.getIndex()).
Once you have an IIndexName, you can call getFile() on it to get an IIndexFile, then getLocation() on that to get an IIndexFileLocation. IIndexFileLocation has methods that you should be able to use to identify your header file, such as getFullPath().
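To make the last step concrete, here is a minimal sketch of the "is this declaration in my header?" check. It deliberately uses no CDT types so that it is self-contained: LibraryHeaderFilter and isLibraryDeclaration are hypothetical names, and the URI argument stands in for whatever you get from the IIndexFileLocation. As far as I know, getFullPath() is workspace-relative and may be null for files outside the workspace, so comparing on IIndexFileLocation.getURI() may be the more robust choice; treat that as an assumption to verify.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: decides whether a file location is one of the library headers. */
final class LibraryHeaderFilter {
    private final Set<Path> headers = new HashSet<>();

    LibraryHeaderFilter(String... headerPaths) {
        for (String p : headerPaths) {
            // Normalize so that "a/../b/myLib.h" and "b/myLib.h" compare equal.
            headers.add(Paths.get(p).toAbsolutePath().normalize());
        }
    }

    /** location: URI of the file containing a declaration (assumed to be a file: URI). */
    boolean isLibraryDeclaration(URI location) {
        if (location == null || !"file".equals(location.getScheme())) {
            return false; // unknown or non-local location: treat as not from the library
        }
        Path declared = Paths.get(location).toAbsolutePath().normalize();
        return headers.contains(declared);
    }
}
```

In a plug-in you would call isLibraryDeclaration() once per IIndexName returned by findDeclarations(), and count the binding as a library element if any of its declarations passes the check.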
> E.g., if I have a header with a Document class that has a Load() method
> and a source with Document::Load(), I expect to get both index names, right?
I would expect so, yes.
> This does not happen always...
If you're able to share a reproducing testcase (all relevant files), and some details about how you are setting up the environment (e.g. a plugin?) from which you are invoking this code, I would be happy to try to reproduce the issue, and if I'm able to reproduce it, look into why it's not working.
My question is how CDT achieves binding resolution while writing code (and auto completion).
I am asking this because I am doing something similar for a research project, i.e., I want to identify library usages (classes, functions, variables) in a codebase. I solve this by generating the ASTs and using the class/function/variable bindings to check whether any binding (definition/declaration) comes from the library headers.
I want to find out if there is a better way to do it.
For instance, assume I have a Doc.h header with a Document class that has a Load() method. My main.cpp includes Doc.h. I suppose CDT uses the index to find all relevant CElements (CompositeType, Function, Variable, Using etc) and provide auto-completion or show errors if an incompatible method/variable is used from Doc.h.
> My question is how CDT achieves binding resolution while
> writing code (and auto completion).
CDT has a parser which parses C++ code into an AST, and
semantic analysis code which runs on that AST to do things
like connect names to the entities ("bindings") they name,
much like a compiler's front-end would.
CDT also maintains a project-wide (in fact, workspace-wide
in the case of dependent projects) queryable database of
semantic information about the code, called the "index". The
index is built by constructing an AST and performing
semantic analysis for each file in the project, and traversing
the analyzed AST to populate the index with information.
The semantic analysis is designed in such a way that it can
consume information from the index, and this is leveraged to
avoid parsing header files once for every file they are
#included in. So, when parsing a file, only code located in the
file directly (as opposed to code included via #includes) is
parsed, and semantic analysis for the file gets information
about the included code from the index.
Let me know if that answers your question; if you have more
specific questions, I'm happy to try to answer them.
> I am asking this because I am doing something similar for
> a research project, i.e., I want to identify library usages
> (classes, functions, variables) in a codebase. I solve this
> by generating the ASTs and using the class/function/variable
> bindings to check if any binding (definition/declaration)
> comes from the library headers.
> I want to find out if there is a better way to do it.
Depending on your needs, you may be able to obtain the
information you need from the index, and thereby avoid
parsing the files in the project again (they are parsed once
while building the index).
Here's an outline of how that might be done.
- Use IIndex.getAllFiles() to get a list of all files about
which information is stored in the index.
- Identify which files belong to the library (for example,
by examining their filenames/paths).
- For each file in the library, use IIndexFile.findNames()
to get a list of all names in the file.
- Filter the list of names to those for which
IName.isDefinition() is true (i.e. those which are
definitions).
- Use IIndex.findBinding(IName) to get the binding
being defined by the name.
- Use IIndex.findReferences(IBinding) to get a list of
all references to that binding in the project.
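
The control flow of the outline above can be sketched as follows. This is not CDT code: NameEntry and LibraryUsageQuery are hypothetical stand-ins for a tiny in-memory "index", so that the sketch runs without the CDT jars. It also simplifies a name's role to a single boolean, and it assumes you only care about references from outside the library. The comments note which IIndex call each part would correspond to.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-in for an index name: where it occurs, what it names, and its role. */
final class NameEntry {
    final String file;        // path of the file containing the name
    final String binding;     // the entity the name resolves to
    final boolean definition; // true if the name defines the binding

    NameEntry(String file, String binding, boolean definition) {
        this.file = file;
        this.binding = binding;
        this.definition = definition;
    }
}

final class LibraryUsageQuery {
    /** Maps each binding defined under libraryDir to the non-library files that use it. */
    static Map<String, List<String>> usages(List<NameEntry> index, String libraryDir) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        // getAllFiles() + path filter + findNames() + isDefinition() + findBinding():
        // collect every binding that has a definition in a library file.
        for (NameEntry n : index) {
            if (n.file.startsWith(libraryDir) && n.definition) {
                result.putIfAbsent(n.binding, new ArrayList<>());
            }
        }
        // findReferences(): collect uses of those bindings outside the library.
        for (NameEntry n : index) {
            if (!n.definition && !n.file.startsWith(libraryDir)
                    && result.containsKey(n.binding)) {
                result.get(n.binding).add(n.file);
            }
        }
        return result;
    }
}
```

Library bindings that end up with an empty list are defined in the library but never used by the rest of the project, which is the complement of the set you are after.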
I'm not sure if these steps are the most efficient way to
obtain this information from the index. It seems like using
IPDOMVisitor might be more efficient, although I don't see
us exposing a way to use that via public API (though
maybe I'm overlooking something).