git-p4 migrate branches in different subdirectories

I want to migrate a source tree from Perforce to Git. The source contains several dev branches scattered across the Perforce depot, not necessarily in the same directory. For example, the structure looks something like this:

//depot/dev/project/master 
//depot/dev/project/branch1 
//depot/dev/project/branch2
//depot/dev/sub-project/branch3 
//depot/dev/sub-project/branch4 
//depot/patch-project/branch5 
//depot/patch-project/special/developern/branch6 

I went through the git-p4 documentation (https://git-scm.com/docs/git-p4, BRANCH DETECTION section) and also similar discussions such as http://forums.perforce.com/index.php?/topic/1395-git-p4-and-multiple-branches/.

I am able to migrate branches, with their history, when they sit under the same immediate parent directory, such as

     //depot/dev/project/branch1 and 
     //depot/dev/project/branch2 
    

    What I have not been able to work out is how to migrate all six branches together in one pass.

    I tried running the migration at the //depot@all level after specifying the branch specs, but it fails because the Perforce server is huge: it ends with either a maxresults exception or a session timeout. Can somebody please advise how this scenario can be handled?

    Another option I see is to migrate the branches separately (one branch per Git repo) and then merge all of them into a new Git repo. I am not sure what the impact or downside of doing this would be.

    Thanks and Regards,
    Amar Kumbhar.
    

  3 solutions collected from the web for “git-p4 migrate branches in different subdirectories”

    Summary: It works. git-p4 is a great tool, very intelligent, and comes with a lot of configurable options. Multiple branches scattered across the depot tree migrated successfully. The import has to be run at the highest (topmost) Perforce directory that covers all the sub-directories or branches of interest. For efficient operation, I suggest using the --changesfile option to specify explicitly which changelists to import. Also use git-p4.branchUser and git-p4.branchList to specify the branchspecs explicitly.

    Details: Here I am showing the settings that worked for me. There may be a better way to achieve the goal.

    Perforce depot structure: (as mentioned in question)

    Perforce client: The client view is set at the highest (topmost) p4 directory. This is very important; otherwise git-p4 may treat changelists that are restricted by the client view as empty commits and exclude them.

       //depot/... //myp4client/...
    

    Perforce branchspecs: I created a single branchspec that covers the dependency (parent/child) information for all my branches.

    $ p4 branch -o test1 | grep "//"
    
        //depot/dev/project/master/... //depot/dev/project/branch1/...
        //depot/dev/project/master/... //depot/dev/project/branch2/...
        //depot/dev/project/branch1/... //depot/dev/sub-project/branch3/...
        //depot/dev/project/branch1/... //depot/dev/sub-project/branch4/...
        //depot/dev/project/master/... //depot/patch-project/branch5/...
        //depot/patch-project/branch5/... //depot/patch-project/special/developern/branch6/...
    

    git-p4 config items: Next, I set up an empty Git repository and the following config items.

     mkdir workdir
     cd workdir
     git init
    

    (** perforce variables)

    git config git-p4.user myp4user
    git config git-p4.password myp4password
    git config git-p4.port myp4port
    git config git-p4.client myp4client
    

    (** force to use perforce client spec)

    git config git-p4.useClientSpec true
    git config git-p4.client myp4client
    

    ( ** restrict branch detection to branchspecs created by me)

    git config git-p4.branchUser myp4user
    

    ( ** branch information and dependency relations; interestingly, only the last name (the final directory name in the branch path) needs to be given, and git-p4 automatically detects what is required, i.e. it expands the branch name to the full path )

    git config git-p4.branchList master:branch1
    git config --add git-p4.branchList master:branch2
    git config --add git-p4.branchList branch1:branch3
    git config --add git-p4.branchList branch1:branch4
    git config --add git-p4.branchList master:branch5
    git config --add git-p4.branchList branch5:branch6
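If the branchspec has many view lines, the branchList entries could be derived from it rather than typed by hand. The following is only a rough sketch: `branchlist_pairs` and `register_branchlist` are hypothetical helper names, and it assumes the convention reported above (only the last directory name of each side is needed), which may not hold if two branches share the same final directory name.

```shell
# Turn branchspec view lines (read from stdin) into "source:destination"
# pairs, keeping only the last directory name of each side.
# Lines that do not start with a depot path (//...) are ignored.
branchlist_pairs() {
    awk 'NF >= 2 && $1 ~ /^\/\// {
        src = $1; dst = $2
        sub(/\/\.\.\.$/, "", src); sub(/\/\.\.\.$/, "", dst)  # drop trailing /...
        sub(/.*\//, "", src);      sub(/.*\//, "", dst)       # keep last component
        print src ":" dst
    }'
}

# Register every pair from the named branchspec in the current repository.
# Assumption: p4 is logged in and this runs inside the git work directory.
register_branchlist() {
    p4 branch -o "$1" | branchlist_pairs | while read -r pair; do
        git config --add git-p4.branchList "$pair"
    done
}
```

With the branchspec from this answer, `register_branchlist test1` would be expected to add the same six entries as the commands above.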
    

    Changelists file: Next, I collected all the changelists for all the branches I am migrating.

    p4 changes //depot/dev/project/master/...  | cut -d' ' -f2 >> master.txt
    p4 changes //depot/dev/project/branch1/...  | cut -d' ' -f2 >> master.txt
    p4 changes //depot/dev/project/branch2/...  | cut -d' ' -f2 >> master.txt
    p4 changes //depot/dev/sub-project/branch3/...  | cut -d' ' -f2 >> master.txt
    p4 changes //depot/dev/sub-project/branch4/...  | cut -d' ' -f2 >> master.txt
    p4 changes //depot/patch-project/branch5/...  | cut -d' ' -f2 >> master.txt
    p4 changes //depot/patch-project/special/developern/branch6/...  | cut -d' ' -f2 >> master.txt
    
    sort -n master.txt | uniq > master_sorted.txt
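The same collection step can be written as a loop, which is easier to extend when more branch paths are added. This is a minimal sketch, not the exact commands I ran: `extract_changelists` and `collect_changelists` are hypothetical helper names, and the parsing assumes the usual `p4 changes` line format (`Change <number> on <date> by <user>@<client> '<description>'`).

```shell
# Extract changelist numbers from `p4 changes` output on stdin and print
# them in ascending numeric order with duplicates removed.
extract_changelists() {
    awk '$1 == "Change" { print $2 }' | sort -n -u
}

# Collect the changelists of several depot paths into one sorted file.
# Usage: collect_changelists OUTPUT_FILE DEPOT_PATH...
# Assumption: p4 is installed and logged in.
collect_changelists() {
    out_file=$1; shift
    for depot_path in "$@"; do
        p4 changes "$depot_path"
    done | extract_changelists > "$out_file"
}
```

For example, `collect_changelists master_sorted.txt //depot/dev/project/master/... //depot/dev/project/branch1/...` would write one sorted, de-duplicated list that can be passed to --changesfile.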
    

    Import: Finally, I ran the import as shown below. I used “sync” rather than “clone”.

    cd workdir 
    ../git-p4.py sync //depot/... --detect-branches --verbose --changesfile /home/myp4user/master_sorted.txt
    

    On smaller depots, “../git-p4.py sync //depot@all --detect-branches --verbose” should also work; in that case there is no need to create the changelists file from the earlier step.

    Once the import had finished, I could see that git-p4 had created all the remote Perforce branches inside a single Git repository.

     git branch -a
      remotes/p4/depot/dev/project/master
      remotes/p4/depot/dev/project/branch1
      remotes/p4/depot/dev/project/branch2
      remotes/p4/depot/dev/sub-project/branch3
      remotes/p4/depot/dev/sub-project/branch4
      remotes/p4/depot/patch-project/branch5
      remotes/p4/depot/patch-project/special/developern/branch6
    

    Then I created local branches from remote p4 branches

      git checkout -b master  remotes/p4/depot/dev/project/master
      git checkout -b branch1  remotes/p4/depot/dev/project/branch1
      git checkout -b branch2   remotes/p4/depot/dev/project/branch2
      git checkout -b branch3   remotes/p4/depot/dev/sub-project/branch3
      git checkout -b branch4   remotes/p4/depot/dev/sub-project/branch4
      git checkout -b branch5   remotes/p4/depot/patch-project/branch5
      git checkout -b branch6   remotes/p4/depot/patch-project/special/developern/branch6
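With more branches, the checkouts could be scripted instead of typed one by one. A rough sketch follows, assuming the imported refs live under refs/remotes/p4 and that the last path component of each ref is unique; `local_name_for_ref` and `create_local_branches` are hypothetical helper names.

```shell
# Derive a local branch name from a ref: keep only the last path component.
local_name_for_ref() {
    printf '%s\n' "${1##*/}"
}

# Create a local branch for every ref under refs/remotes/p4.
# A symbolic HEAD ref, if present, is skipped.
# Assumption: run inside the repository that git-p4 imported into.
create_local_branches() {
    git for-each-ref --format='%(refname)' refs/remotes/p4 |
    while read -r ref; do
        case $ref in
            */HEAD) continue ;;
        esac
        git checkout -b "$(local_name_for_ref "$ref")" "$ref"
    done
}
```

If two depot branches end in the same directory name, the second `git checkout -b` would fail, so those would still need to be named by hand.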
    

    Next, I simply added a remote origin and pushed the code to the Git repo.
    Thanks for the various pointers and help available on Stack Overflow and elsewhere online.

    The latest version of git-p4 should not report maxresults exceptions, since it retrieves at most 500 changes at a time by default. You can try changing this value with the --changes-block-size argument, which might help you overcome the problem you reported.

    Here’s the description of this argument from the git-p4 documentation:

    --changes-block-size <n>::
        The internal block size to use when converting a revision
        specifier such as '@all' into a list of specific change
        numbers. Instead of using a single call to 'p4 changes' to
        find the full list of changes for the conversion, there are a
        sequence of calls to 'p4 changes -m', each of which requests
        one block of changes of the given size. The default block size
        is 500, which should usually be suitable.
    

    I faced a similar problem before, and in my case the workaround I ended up with is the one you described: I cloned each branch into its own Git repo.

    Even then, individual branches had too many objects, and the P4 admins were not willing to raise the limit on requested objects, so I also had to limit the amount of history I cloned.

    For inactive branches I would only clone the latest version and discard all its history.

    For more active branches I would keep 4 to 6 weeks of history.

    This means you have to manually figure out what the CL numbers are for the amount of history you want to keep:

    From git-p4 docs (section: DEPOT PATH SYNTAX):

    $ git p4 clone //depot/my/project@1,6 # Import only changes 1 through 6.
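Finding the starting CL number for a given cutoff date does not have to be fully manual. A small sketch, assuming the usual `p4 changes` line format (`Change <number> on <YYYY/MM/DD> by ...`); `first_change_since` is a hypothetical helper name.

```shell
# Print the lowest changelist number dated on or after a cutoff date.
# Reads `p4 changes` output on stdin. The cutoff must use the same
# YYYY/MM/DD form, so a plain string comparison orders the dates correctly.
first_change_since() {
    awk -v cutoff="$1" '$1 == "Change" && $4 >= cutoff { print $2 }' |
    sort -n | head -n 1
}
```

For example, `p4 changes //depot/my/project/... | first_change_since 2015/05/01` would print the first changelist to keep, which can then serve as the lower bound of the `@start,end` range shown above.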
