Bazaar

Code review comment for lp:~gerard-/bzr/update

update
Merge into bzr.dev

Revision history for this message

John A Meinel (jameinel) wrote on 2010-02-05:

8 + a_file = file('u1/hosts', 'rt')
9 + text = a_file.read()
10 + a_file.close()
11 + self.assertEqual(text, '<<<<<<< TREE\nfirst offline change in u1=======\naltered in u2>>>>>>> MERGE-SOURCE\n')
12 +

^- I think this is better written as

self.assertFileEqual('u1/hosts', '<<<<<< ...')

If that doesn't work because of line endings (it may be done in 'b' instead of
't' mode), then at least use:

self.assertEqualDiff("""<<<<<<< TREE
first offline change in u1...
""", text)

We generally put 'expected' first, and assertEqualDiff gives nice output when
they aren't equal.

You have this a few times, and the lines are longer than 80 chars. (And some
other comments, etc are also >80 chars.)

I think this would do well with a clear statement of steps. Probably in a
comment somewhere. At least, it helped me sort it out. The old code did:

1) old_tip = self.branch.update() (go from branch.last_revision() =>
master_branch.last_revision(), return old last_revision())
2) last_rev = basis_revision for current working tree
3) revision = new master_branch.last_revision()
4) merge_inner(last_rev => revision)
5) set_parent_trees(revision + existing_merges)
6) base_rev_id = unique_lca(revision, old_tip)
7) merge_inner(base_rev_id => old_tip)

Which thinking about it, does seem a bit backwards. In the above analysis we
have these revisions at play

W wt.last_revision()
B wt.branch.last_revision()
M wt.branch.get_master_branch().last_revision()
P wt.get_parent_ids()[1:], things that claim to be merged into the WT
already

The goal is to get from W to M in the most logical procession, but then also
handle that some of W (or B) may not be in M, so they have to be re-introduced.
Actually, I think it is that revisions in B that are not in M need to be
re-introduced, but I'm not sure if we care about revisions in W that aren't
in B or M. (though we certainly do care about P.)

So the current code does:
merge_inner(W => M)
Change the workingtree directly into the new master revision
merge_inner(ancestor(W,B) => W)
And re-merge the changes in the Branch that were not present in the Master

The new code does:

1) [same] old_tip = self.branch.update()
2) [same] last_rev = basis_revision for current working tree
3) [same] revision = master_branch.last_revision()
4) if old_tip != revision
     merge_inner(ancestor(revision, old_tip) => old_tip)
   if conflicts => stop (mark old_tip as merged into current tree)
5) if not merged:
    merge_inner(last_rev => revision)
   else:
    merge_inner(ancestor(revision, last_rev) => revision)

So I think this is:
merge_inner(ancestor(M,B) => B)
merge_inner(ancestor(M,W) => M)

# Note: I think one thing that neither code handles is when some of the
# revisions in M are already part of P, and thus probably shouldn't be
# re-introduced

So what does that do with a real graph

    1
    |\
    2 3
    | |
  B-4 5-M
    |
    W

The old code would do:
apply the difference from W to M (revert 2,4, apply 3,5)
merge (1 => B) to M (apply 2,4)

Which should change the graph to look like:

   1
   |\
   3 2
   | |
MB-5 4
   |/
   W

Which is ~ what I would expect. It probably suffers from the problem that local
changes are going to be based on 4, and the W=>M step removes 4 until they are
re-introduced by the second merge. But it is still fairly likely to cause
conflicts. Imagine if 4 introduces a new file, and W has some uncommitted
changes to that file. Moving from 4 => 5 will cause those changes to not have a
home, and moving from 5 => 4+W will conflict again on those conflicts.

Starting from this:
    1
    |\
    2 3
    | |
  B-4 5-M
    |
    W
The new code changes it to
    apply the difference from anc(M,B)=1 => B (apply 2,4)
    apply the difference from anc(M,W)=1 => W (apply 3,5)

Step 1 here looks a bit like a no-op, because you are applying 2,4 to a tree
that already has them. At best it is a no-op, at worst if you have local
modifications, it could cause them to conflict. (You are re-applying changes
you already have.)

But the graph ends up as:
   1
   |\
   2 3
   | |
   4 5-MB
   |/
   W

Only we then swap the parents. So what I like about it is the second phase,
which is, essentially, apply the changes in the master to the working tree, and
then swap the parents.

The first merge seems wrong to me, though. I think

1) Maybe the check should be "if old_tip != last_rev". Which is checking to see
if W == B rather than checking to see if B == M.
    At least at first glance, that would mean that we wouldn't do the merge if
    we'd determined that the local branch is at the same revision as the
    working tree. (Thus you don't apply revs you already have.) though if you
    consider:

    1
    |\
  B-2 3
    | |
    4 5-M
    |
    W

In that case, even with this check you would be trying to re-apply 2. What you
really want (I think) is to revert 4, as in this graph the final result should
be equivalent to:

   1
   |\
   3 2
   | |\
MB-5 | 4
   |/
   W

And the changes in 4 have been removed from the WT.

Anyway, I need a bit more time to analyze this, and consider where we really
want to be. I do think that doing something where we merge from 'anc(M,W) => M'
is more correct than doing 'W => M', as that tries to revert local commits
while leaving uncommitted changes in place.

I'll try to get back to this, I have to stop working for now.

8	+ a_file = file('u1/hosts', 'rt')
9	+ text = a_file.read()
10	+ a_file.close()
11	+ self.assertEqual(text, '<<<<<<< TREE\nfirst offline change in u1=======\naltered in u2>>>>>>> MERGE-SOURCE\n')
12	+

^- I think this is better written as

self.assertFileEqual('u1/hosts', '<<<<<< ...')

If that doesn't work because of line endings (it may be done in 'b' instead of
't' mode), then at least use:

self.assertEqualDiff("""<<<<<<< TREE
first offline change in u1...
""", text)

We generally put 'expected' first, and assertEqualDiff gives nice output when
they aren't equal.

You have this a few times, and the lines are longer than 80 chars. (And some
other comments, etc are also >80 chars.)

I think this would do well with a clear statement of steps. Probably in a
comment somewhere. At least, it helped me sort it out. The old code did:

1) old_tip = self.branch.update() (go from branch.last_revision() =>
   master_branch.last_revision(), return old last_revision())
2) last_rev = basis_revision for current working tree
3) revision = new master_branch.last_revision()
4) merge_inner(last_rev => revision)
5) set_parent_trees(revision + existing_merges)
6) base_rev_id = unique_lca(revision, old_tip)
7) merge_inner(base_rev_id => old_tip)

Which thinking about it, does seem a bit backwards. In the above analysis we
have these revisions at play

W   wt.last_revision()
B   wt.branch.last_revision()
M   wt.branch.get_master_branch().last_revision()
P   wt.get_parent_ids()[1:], things that claim to be merged into the WT
    already

So the current code does:
merge_inner(W => M)
    Change the workingtree directly into the new master revision
merge_inner(ancestor(W,B) => W)
    And re-merge the changes in the Branch that were not present in the Master

The new code does:

So I think this is:
merge_inner(ancestor(M,B) => B)
merge_inner(ancestor(M,W) => M)

# Note: I think one thing that neither code handles is when some of the
#       revisions in M are already part of P, and thus probably shouldn't be
#       re-introduced

So what does that do with a real graph

1
    |\
    2 3
    | |
  B-4 5-M
    |
    W

The old code would do:
    apply the difference from W to M (revert 2,4, apply 3,5)
    merge (1 => B) to M (apply 2,4)

Which should change the graph to look like:

1
   |\
   3 2
   | |
MB-5 4
   |/
   W

But the graph ends up as:
   1
   |\
   2 3
   | |
   4 5-MB
   |/
   W

Only we then swap the parents. So what I like about it is the second phase,
which is, essentially, apply the changes in the master to the working tree, and
then swap the parents.

The first merge seems wrong to me, though. I think

1
    |\
  B-2 3
    | |
    4 5-M
    |
    W

In that case, even with this check you would be trying to re-apply 2. What you
really want (I think) is to revert 4, as in this graph the final result should
be equivalent to:

1
   |\
   3 2
   | |\
MB-5 | 4
   |/
   W

And the changes in 4 have been removed from the WT.

I'll try to get back to this, I have to stop working for now.

review: Needs Information

« Back to merge proposal