Livegrep is a tool, partially inspired by Google Code Search, for interactive regex search of ~gigabyte-scale source repositories. You can see a running instance at http://livegrep.com/.
livegrep builds using bazel. You will need to
install bazel at the version this repository pins.
Running bazel via bazelisk will download the right version automatically.
livegrep vendors and/or fetches all of its dependencies using bazel,
and so should only require a relatively recent C++ compiler to build.
Once you have those dependencies, you can build using
bazel build //...
Note that the initial build will download around 100M of dependencies. These will be cached once downloaded.
To run livegrep, you need to invoke both the codesearch backend
index/search process, and the
livegrep web interface.
To run the sample web interface over livegrep itself, once you have built both:
In one terminal, start the
codesearch server like so:
bazel-bin/src/tools/codesearch -grpc localhost:9999 doc/examples/livegrep/index.json
In another, run livegrep:
bazel-bin/cmd/livegrep/livegrep_/livegrep
In a browser, now visit http://localhost:8910/, and you should see a working livegrep.
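If you would rather start both processes from one script than from two terminals, the frontend should start only once the backend is accepting connections. The following is a minimal sketch, assuming bash (it relies on bash's /dev/tcp redirection). The function name wait_for_port and the retry count are illustrative and are not part of livegrep.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of livegrep): poll a TCP port once per
# second until it accepts a connection or the attempts run out.
wait_for_port() {
  local host="$1" port="$2" tries="${3:-30}"
  while [ "$tries" -gt 0 ]; do
    # bash-only: opening /dev/tcp/HOST/PORT attempts a TCP connection.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

# Usage sketch, with the commands from this README:
#   bazel-bin/src/tools/codesearch -grpc localhost:9999 doc/examples/livegrep/index.json &
#   wait_for_port localhost 9999 60 && bazel-bin/cmd/livegrep/livegrep_/livegrep
```

Because codesearch has to build its index before it starts serving, a large repository may need a higher retry count than the one shown.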
The codesearch binary is responsible for reading source code,
maintaining an index, and handling searches.
The livegrep binary is stateless
and relies only on the connection to
codesearch over a TCP connection.
By default, codesearch will build an in-memory index over the
repositories specified in its configuration file. You can, however,
also instruct it to save the index to a file on disk. This has the dual
advantages of allowing indexes that are too large to fit in RAM, and
of allowing an index file to be reused. You instruct codesearch to
generate an index file via the
-dump_index flag and to not launch
a search server via the -index_only flag:
bazel-bin/src/tools/codesearch -index_only -dump_index livegrep.idx doc/examples/livegrep/index.json
Once codesearch has built the index, this index file can be used for
future runs. Index files are standalone, and you no longer need access
to the source code repositories, or even a configuration file, once an
index has been built. You can just launch a search server like so:
bazel-bin/src/tools/codesearch -load_index livegrep.idx -grpc localhost:9999
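The two invocations above can be folded into one helper that picks the right command depending on whether the index file already exists. This is a sketch only: the function name codesearch_command and the CODESEARCH variable are hypothetical, and the flags are exactly the ones shown in this README.

```shell
# Path to the codesearch binary, as used elsewhere in this README.
CODESEARCH="${CODESEARCH:-bazel-bin/src/tools/codesearch}"

# Hypothetical helper: print the codesearch command for the current state.
# If the index file is missing, print the build-and-dump command;
# otherwise print the load-and-serve command.
codesearch_command() {
  idx="$1"
  config="$2"
  addr="${3:-localhost:9999}"
  if [ -f "$idx" ]; then
    echo "$CODESEARCH -load_index $idx -grpc $addr"
  else
    echo "$CODESEARCH -index_only -dump_index $idx $config"
  fi
}
```

Calling `codesearch_command livegrep.idx doc/examples/livegrep/index.json` before an index exists prints the indexing command, and calling it again after the index has been written prints the serving command. Printing rather than executing keeps the helper easy to inspect before you run anything.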
The schema for the
codesearch configuration file is defined using
protobuf in src/proto/config.proto.
The livegrep frontend accepts an optional positional argument
indicating a JSON configuration file; see
doc/examples/livegrep/server.json for an example, and
server/config/config.go for documentation of available options.
By default, livegrep will connect to a single local codesearch
instance on port
9999, and listen for HTTP connections on port 8910.
livegrep includes a helper driver, livegrep-github-reindex, which
can automatically update and index selected GitHub repositories. To
download and index all of my repositories (except for forks), storing
the repos in
repos/ and writing
nelhage.idx, you might run:
bazel-bin/cmd/livegrep-github-reindex/livegrep-github-reindex -user=nelhage -forks=false -name=github.com/nelhage -out nelhage.idx
You can now use
nelhage.idx as an argument to codesearch -load_index.
livegrep provides the ability to view source files directly in livegrep, as
an alternative to linking files to external viewers. This was initially implemented
by @jboning. There are
a few ways to enable this. The most important step is to provide a config file that
livegrep can use to figure out where your source files are (locally).
See doc/examples/livegrep/server.json for an
example config file, and server/config/config.go for documentation on available options. To enable the file viewer, you must include an
IndexConfig block inside of the config file. An example
IndexConfig block can be seen at doc/examples/livegrep/index.json.
Tip: For each repository included in your
IndexConfig, make sure to include
metadata.url_pattern if you would like the file viewer to be able to link out to the external host. You'll see a warning in your browser console if you don't do this.
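Before starting the frontend, you can run a quick sanity check that your IndexConfig mentions url_pattern at all. The sketch below is deliberately crude: it is a textual search, it does not parse JSON, and it cannot tell whether every repository has the field. The function name has_url_pattern is hypothetical.

```shell
# Hypothetical helper: crude textual check that an index config file
# contains a "url_pattern" key at least once. It does not parse JSON and
# does not verify that each repository entry has one.
has_url_pattern() {
  grep -q '"url_pattern"' "$1"
}
```

For example, `has_url_pattern repos/livegrep.json || echo "no url_pattern found" >&2` prints a notice when the field is absent everywhere in the file.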
If you are already using the
livegrep-github-reindex tool, an IndexConfig index file is generated for you, by default named "livegrep.json".
Run the indexer
bazel-bin/cmd/livegrep-github-reindex/livegrep-github-reindex_/livegrep-github-reindex -user=xvandish -forks=false -name=github.com/xvandish -out xvandish.idx
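The indexer will have done these main things:
1. Cloned (or updated) the repositories for user xvandish under repos/xvandish.
2. Written an IndexConfig file, repos/livegrep.json.
3. Built the index file xvandish.idx.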
Here's an abbreviated version of what your directory might look like after running the indexer.
livegrep
│   xvandish.idx
└───repos
│   │   livegrep.json
│   └───xvandish
│       └───repo1
│       └───repo2
│       └───repo3
Now that you have generated an index file, it's time to run livegrep with it.
Run the backend:
bazel-bin/src/tools/codesearch -load_index xvandish.idx -grpc localhost:9999
Run the frontend in another shell instance with the path to the index config file located at ./repos/livegrep.json:
bazel-bin/cmd/livegrep/livegrep_/livegrep -index-config ./repos/livegrep.json
In a browser, now visit
http://localhost:8910 and you should see a working
livegrep. Search for something, and once you get a result, click on the file
name or a line number. You should now be taken to the file browser!
Livegrep's CI builds Docker images into the livegrep
organization docker repository on every merge. These images
should be generally usable. For instance, to build+run a livegrep
index of this repository, you could run:
docker run -v $(pwd):/data ghcr.io/livegrep/livegrep/indexer /livegrep/bin/livegrep-github-reindex -repo livegrep/livegrep -http -dir /data
docker network create livegrep
docker run -d --rm -v $(pwd):/data --network livegrep --name livegrep-backend ghcr.io/livegrep/livegrep/base /livegrep/bin/codesearch -load_index /data/livegrep.idx -grpc 0.0.0.0:9999
docker run -d --rm --network livegrep --publish 8910:8910 ghcr.io/livegrep/livegrep/base /livegrep/bin/livegrep -docroot /livegrep/web -listen=0.0.0.0:8910 --connect livegrep-backend:9999
And then access http://localhost:8910/
You can also find the docker-compose config powering
livegrep.com in the
livegrep builds an index file of your source code, and then works entirely out of that index, with no further access to the original git repositories.
The index file will vary somewhat in size, but will usually be 3-5x
the size of the indexed text.
livegrep memory-maps the index file
into RAM, so it can work out of index files larger than (available)
RAM, but will perform better if the file can be loaded entirely into
memory. Barring that, keeping the disk on fast SSDs is recommended for best performance.
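To plan disk and memory, the 3-5x rule of thumb above can be turned into a quick range estimate. This is a sketch of the arithmetic only; the function name estimate_index_size is hypothetical, and the actual size of an index will vary.

```shell
# Hypothetical helper: print the low and high index size estimates, in the
# same unit as the input, using the 3-5x rule of thumb stated above.
estimate_index_size() {
  text_size="$1"
  echo "$((text_size * 3)) $((text_size * 5))"
}

# Example: 2048 MB of indexed text.
estimate_index_size 2048   # prints "6144 10240"
```

If the upper figure is well above the RAM available on the host, expect searches to depend on disk speed, which is where fast SSDs help.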
Livegrep uses Google's re2 regular expression engine, and inherits its supported syntax.
RE2 is mostly PCRE-compatible, but with some mostly-deliberate exceptions.
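Two well-known PCRE features that RE2 does not support are backreferences (such as \1) and lookaround assertions (such as (?=...) or (?<!...)). The sketch below is a heuristic pre-check for queries copied from PCRE-based tools. The function name uses_unsupported_re2_syntax is hypothetical and not part of livegrep or RE2. Because it is a plain text match, it can misreport patterns that contain an escaped backslash followed by a digit.

```shell
# Hypothetical helper (not part of livegrep or RE2): heuristic test for two
# PCRE features that RE2 does not support, backreferences and lookaround.
# Returns 0 if the pattern appears to use one of them, 1 otherwise.
uses_unsupported_re2_syntax() {
  case "$1" in
    *'(?='* | *'(?!'* | *'(?<='* | *'(?<!'*) return 0 ;;  # lookahead / lookbehind
    *'\'[1-9]*) return 0 ;;                               # backreference such as \1
  esac
  return 1
}
```

For example, `uses_unsupported_re2_syntax '(a)\1' && echo "rewrite this query"` prints the notice, while a pattern such as `foo.*bar` passes through silently.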
Livegrep is open source. See COPYING for more information.