Searching subversion history (full text)

Is there a way to perform a full text search of a Subversion repository, including all the history?

For example, I’ve written a feature that I used somewhere, but then it wasn’t needed, so I svn rm’d the files, but now I need to find it again to use it for something else. The svn log probably says something like “removed unused stuff”, and there’s loads of checkins like that.

Edit 2016-04-15: Please note that “full text search” here means searching the actual diffs of the commit history, not just filenames and/or commit messages. I’m pointing this out because the author’s phrasing above does not reflect that very well, since in his example he might as well be looking only for a filename and/or commit message. Hence a lot of the svn log answers and comments.

16 solutions collected from the web for “Searching subversion history (full text)”

    git svn clone <svn url>
    
    git log -G<some regex>
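
    A minimal sketch of the two commands above wrapped in one function, assuming git with git-svn is installed. The function name svn_history_grep and the default scratch directory are made up for illustration, and --stdlayout assumes the conventional trunk/branches/tags layout; drop it if the repository is laid out differently.

```shell
#!/bin/bash
# Hypothetical helper: mirror an SVN repository into git, then search every diff.
# Usage: svn_history_grep <svn url> <regex> [scratch dir]
svn_history_grep() {
    url=$1
    regex=$2
    dir=${3:-svn-mirror}
    # One-time import of the full history; this can be slow on large repositories.
    if [ ! -d "$dir/.git" ]; then
        git svn clone --stdlayout "$url" "$dir" || return 1
    fi
    # -G selects commits whose diff adds or removes a line matching the regex;
    # --all includes the imported branches and tags, -p prints the patches.
    git -C "$dir" log --all -p -G"$regex"
}
```

    Because -G looks at the patch text of every commit, code that was later deleted is still found in the commit that added it and in the one that removed it.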
    

    svn log in Apache Subversion 1.8 supports a new --search option, so you can search a repository’s log history without third-party tools or scripts.

    svn log --search matches against the author, date, and log message text and, when --verbose is also given, the list of changed paths.

    See SVNBook | svn log command-line reference.
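
    For illustration, a hedged sketch of a typical invocation, assuming an svn client of version 1.8 or later. The wrapper name svn_log_search is hypothetical; --search and --search-and are the client’s own options, and -v is added so that changed paths are searched and shown as well.

```shell
#!/bin/bash
# Hypothetical wrapper: list revisions matching the first term and, if a second
# term is given, matching that one as well.
# Usage: svn_log_search <term> [<second term>]
svn_log_search() {
    if [ -n "$2" ]; then
        # --search-and narrows the preceding --search (logical AND).
        svn log -v --search "$1" --search-and "$2"
    else
        svn log -v --search "$1"
    fi
}
```

    Note that this still searches only log metadata and paths, not file contents or diffs.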

    If you are running Windows, have a look at SvnQuery. It maintains a full text index of local or remote repositories. Every document ever committed to a repository gets indexed. You can do Google-like queries from a simple web interface.

    I’m using a small shell script, but it only works for a single file. You can of course combine it with find to include more files.

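    #!/bin/bash
    # Usage: <script> <file> <pattern>
    # Prints every revision of <file> whose contents match <pattern>.
    for REV in $(svn log -q "$1" | awk '/^r[0-9]+ \|/ {print $1}'); do
      if svn cat -r "$REV" "$1" | grep -q -- "$2"; then
        echo "$REV"
      fi
    done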
    

    If you really want to search everything, use the svnadmin dump command and grep through that.

    The best way that I’ve found to do this is with less:

    svn log --verbose | less

    Once less shows the output, you can press / to search, as in Vim.

    Edit:

    According to the author, he wants to search more than just the messages and the file names. In which case you will be required to ghetto-hack it together with something like:

    svn diff -r0:HEAD | less
    

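    You can also substitute grep or something else to do the searching for you. If you want to use this on a sub-directory of the repository, you will need to use svn log to discern the first revision in which that directory existed, and use that revision instead of 0. Keep in mind that this prints only the net difference between the two revisions, so code that was added and later removed again will not show up; for that you need the per-revision diffs, for example from svn log --diff.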

    I have been looking for something similar. The best I have come up with is OpenGrok. I have not tried to implement it yet, but it sounds promising.

    While not free, you might take a look at Fisheye from Atlassian, the same folks that bring you JIRA. It does full text search against SVN with many other useful features.

    http://www.atlassian.com/software/fisheye/

    svn log -v [repository] > somefile.log
    

    To include the diffs as well, use the --diff option:

    svn log -v --diff [repository] > somefile.log
    

    then use vim or nano or whatever you like using, and do a search for what you’re looking for. You’ll find it pretty quickly.

    It’s not a fancy script or anything automated. But it works.
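
    If you would rather not page through the log file by hand, the search can be sketched as a small filter. This is only an illustration: the name diff_grep is made up, and it assumes the default svn log header format (“r123 | author | date | N lines”) and unified diffs.

```shell
#!/bin/bash
# Hypothetical filter: reads `svn log --diff` output on stdin and prints each
# added or removed line that matches the pattern, prefixed with its revision.
# Usage: svn log --diff [repository] | diff_grep <extended regex>
diff_grep() {
    awk -v pat="$1" '
        /^r[0-9]+ \|/ { rev = $1; next }           # revision header line
        /^(---|[+][+][+])/ { next }                # file headers and log separators
        /^[-+]/ && $0 ~ pat { print rev ": " $0 }  # changed lines only
    '
}
```

    Two caveats: a log message line that starts with - or + is treated like a changed line, and a changed line whose own text begins with -- or ++ is skipped along with the file headers.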

    I was looking for the same thing and found this:

    http://svn-search.sourceforge.net/

    I don’t have any experience with it, but SupoSE (open source, written in Java) is a tool designed to do exactly this.

    I just ran into this problem and

    svnadmin dump <repo location> |grep -i <search term>
    

    did the job for me. It returned the revision of the first occurrence and quoted the line I was looking for.
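
    Plain grep shows the matching line but not which revision it belongs to. A rough sketch of a filter that tracks the Revision-number: headers of the dump stream follows; the name dump_grep is hypothetical, matching is case-sensitive (unlike grep -i above), and it assumes a dump made without --deltas so that file contents appear as plain text.

```shell
#!/bin/bash
# Hypothetical filter: reads an `svnadmin dump` stream on stdin and prints every
# matching line together with the revision record it appears in.
# Usage: svnadmin dump -q <repo location> | dump_grep <extended regex>
dump_grep() {
    LC_ALL=C awk -v pat="$1" '
        /^Revision-number: [0-9]+$/ { rev = $2; next }  # start of a revision record
        $0 ~ pat { print "r" rev ": " $0 }              # any line in that record
    '
}
```

    A file that itself contains a line of the form “Revision-number: N” would throw the revision tracking off, so treat the reported revision as a hint and confirm it with svn log.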

    svn log -l<commit limit> | grep -C<5 or more lines> <search message>

    I wrote this as a Cygwin bash script to solve this problem.

    However, it requires that the search term currently appears in a file in the working copy. For each file that matches the filesystem grep, a grep of all the svn diffs for that file is then performed. Not perfect, but it should be good enough for most usage. Hope this helps.

    /usr/local/bin/svngrep

    #!/bin/bash
    # Usage: svngrep $regex @grep_args
    
    regex="$@"
    pattern=`echo $regex | perl -p -e 's/--?\S+//g; s/^\\s+//;'` # strip --args
    if [[ ! $regex ]]; then
        echo "Usage: svngrep \$regex @grep_args"
    else 
        for file in `grep -irl --no-messages --exclude=\*.tmp --exclude-dir=.svn $regex ./`; do
            revs="`svnrevisions $file`";
            for rev in $revs; do
                diff=`svn diff $file -r$[rev-1]:$rev \
                     --diff-cmd /usr/bin/diff -x "-Ew -U5 --strip-trailing-cr" 2> /dev/null`
                context=`echo "$diff" \
                     | grep -i --color=none   -U5 "^\(+\|-\).*$pattern" \
                     | grep -i --color=always -U5             $pattern  \
                     | grep -v '^+++\|^---\|^===\|^Index: ' \
                     `
                if [[ $context ]]; then
                    info=`echo "$diff" | grep '^+++\|^---'`
                    log=`svn log $file -r$rev`
                    #author=`svn info -r$rev | awk '/Last Changed Author:/ { print $4 }'`; 
    
                    echo "========================================================================"
                    echo "========================================================================"
                    echo "$log"
                    echo "$info"
                    echo "$context"
                    echo
                fi;
            done;
        done;
    fi
    

    /usr/local/bin/svnrevisions

    #!/bin/sh
    # Usage:  svnrevisions $file
    # Output: list of fully numeric svn revisions (without the r), one per line
    
    file="$@"
    svn log "$file" 2> /dev/null | awk '/^r[[:digit:]]+ \|/ { sub(/^r/,"",$1); print $1 }'
    

    I usually do what Jack M says (use svn log --verbose), but I pipe to grep instead of less.

    In case you are trying to determine which revision is responsible for a specific line of code, you are probably looking for:

    svn blame
    

    Credit: original answer

    I came across this bash script, but I have not tried it.
