[[!meta copyright="Copyright © 2009 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

[[!meta title="converting from GNU arch to Git -- without direct repository
access"]]

I wanted to import an old GNU arch repository into Git, but only had HTTP
access via ArchZoom.  I spent quite some time trying to teach `git archimport`
to access that repository over HTTP, but this didn't work out.  Too bad --
but at least ArchZoom let me download a tarball of each individual revision:

    $ ls -1 *.tar.gz
    bpf--devel--0.0--base-0.tar.gz
    bpf--devel--0.0--patch-1.tar.gz
    bpf--devel--0.0--patch-10.tar.gz
    bpf--devel--0.0--patch-11.tar.gz
    bpf--devel--0.0--patch-12.tar.gz
    bpf--devel--0.0--patch-2.tar.gz
    bpf--devel--0.0--patch-3.tar.gz
    [...]
    bpf--devel--0.0--patch-9.tar.gz
    bpf--release--0.1--base-0.tar.gz
    bpf--release--0.1--patch-1.tar.gz
    bpf--release--0.1--patch-2.tar.gz
    [...]
    bpf--release--0.1--patch-8.tar.gz

I unpacked these:

    $ for f in *.tar.gz; do tar -xz < "$f" || echo >&2 "$f" failed; done

The last revision's tree apparently contains the patch logs (author, date,
message) of all previous revisions, so I copied them from there:

    $ cp -a ↩
        bpf--release--0.1--patch-8/{arch}/bpf/bpf--devel/bpf--devel--0.0/info@hurdfr.org--hurdfr/patch-log ↩
        d-patch-log
    $ cp -a ↩
        bpf--release--0.1--patch-8/{arch}/bpf/bpf--release/bpf--release--0.1/info@hurdfr.org--hurdfr/patch-log ↩
        r-patch-log

... and extract the information that we need, writing one `.author_name`,
`.author_email`, `.author_date` and `.log` file per revision:

    $ base=bpf--devel--0.0-- && ↩
      for f in d-patch-log/*; do ↩
        grep < "$f" ^Creator: | head -n 1 ↩
          | { read j c && ↩
              echo "$c" | sed s%' <.*'%% ↩
                > "$base""$(basename "$f")".author_name && ↩
              echo "$c" | sed -e 's%.*<%%' -e 's%>.*%%' ↩
                > "$base""$(basename "$f")".author_email; } && ↩
        grep < "$f" ^Standard-date: | head -n 1 | { read j d && echo "$d" ↩
          > "$base""$(basename "$f")".author_date; } && ↩
        { grep < "$f" ^Summary: | head -n 1 | { read j m && echo "$m"; } && ↩
          echo && sed < "$f" '1,/^$/d'; } ↩
          > "$base""$(basename "$f")".log ↩
        || echo >&2 "$f" failed; ↩
      done
    $ base=bpf--release--0.1-- && ↩
      for f in r-patch-log/*; [...]

(Of course, I could have used something more elaborate than shell scripting...)
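
The same field extraction can be sketched as a small shell function, which may
be easier to follow than the one-liner.  This is only an illustration: the
function name `parse_patch_log`, the temporary file and the sample patch log
(author, date, summary, body) are all made up, and real patch logs carry more
headers than shown here.

```shell
#!/bin/sh
# Hypothetical helper: split one GNU arch patch-log file into the pieces
# that the import step needs.  Results are left in shell variables.
parse_patch_log() {
    f=$1
    # "Creator: Name <email>" -> name and email.
    creator=$(sed -n 's/^Creator: //p' "$f" | head -n 1)
    author_name=$(printf '%s\n' "$creator" | sed 's/ <.*//')
    author_email=$(printf '%s\n' "$creator" | sed -e 's/.*<//' -e 's/>.*//')
    # The date as recorded in the Standard-date header.
    author_date=$(sed -n 's/^Standard-date: //p' "$f" | head -n 1)
    # First line of the commit message.
    summary=$(sed -n 's/^Summary: //p' "$f" | head -n 1)
    # Everything after the first blank line is the message body.
    body=$(sed '1,/^$/d' "$f")
}

# Made-up sample input, for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Revision: bpf--devel--0.0--patch-1
Creator: Jane Doe <jane@example.org>
Standard-date: 2009-06-24 12:00:00 GMT
Summary: Fix the frobnicator

A longer description of the change.
EOF

parse_patch_log "$sample"
printf 'name:    %s\n' "$author_name"
printf 'email:   %s\n' "$author_email"
printf 'date:    %s\n' "$author_date"
printf 'summary: %s\n' "$summary"
printf 'body:    %s\n' "$body"
rm -f "$sample"
```

Keeping the values in variables rather than in files is a choice made for this
sketch only; the loop above writes them to files so that the import step can
read them back later.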

Remove the GNU arch stuff that we don't need anymore:

    $ find bpf--*/ -type d \( -name '{arch}' -o -name .arch-ids \) -prune -print0 ↩
        | xargs -0 rm -r

The `base-0` revisions are actually either empty (the `devel` one) or
equivalent to the previous revision (the `release` one), so remove these:

    $ rm -rf bpf--devel--0.0--base-0 bpf--release--0.1--base-0

Finally, import all the other ones:

    $ mkdir g && ( cd g/ && git init )
    $ for d in bpf--d*-? bpf--d*-?? bpf--r*; do ↩
        test -d "$d"/ || continue && ↩
        ( cd g/ && ↩
          rsync -a --delete --exclude=/.git ../"$d"/ ./ && ↩
          git add . && ↩
          GIT_AUTHOR_NAME="$(cat ../"$d".author_name)" ↩
            GIT_AUTHOR_EMAIL="$(cat ../"$d".author_email)" ↩
            GIT_AUTHOR_DATE="$(cat ../"$d".author_date)" ↩
            git commit -F ../"$d".log -a ); ↩
      done
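
The pattern list `bpf--d*-? bpf--d*-??` in the loop above is what keeps the
`devel` revisions in numeric order: a single glob sorts lexically, which would
put `patch-10` before `patch-2`.  (The `bpf--r*` pattern also matches the
`.author_name` etc. files, which is why the loop needs the `test -d`.)  A
minimal sketch of the ordering difference, using made-up empty directories in
a temporary location:

```shell
#!/bin/sh
# Pin the collation order so that the glob results are predictable.
LC_ALL=C
export LC_ALL

demo=$(mktemp -d)
for n in 1 2 3 9 10 11 12; do
    mkdir "$demo/bpf--devel--0.0--patch-$n"
done

# One glob: sorted lexically, so patch-10 comes before patch-2.
lexical=$(cd "$demo" && echo bpf--d*)
# Two globs: single-digit patch numbers first, then double-digit ones.
numeric=$(cd "$demo" && echo bpf--d*-? bpf--d*-??)

echo "lexical: $lexical"
echo "numeric: $numeric"
rm -r "$demo"
```

This trick only covers patch numbers of up to two digits; a branch with a
hundred or more patches would need a third pattern.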

Voilà!


**Update 2009-06-25:**

Half a day later, [[HurdFr]] published a `git archimport`-converted repository
-- which was *identical* to my hand-crafted one (apart from having
`git-archimport-id:` tags in the commit messages, and the first (empty) commit
not being stripped off).  :-)
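
One way to check that two conversions agree even though their commit messages
differ is to compare tree hashes, which depend only on the committed files.
The post does not say how the comparison was actually done, so the following
is just a sketch of that idea; the repositories `a` and `b`, their contents
and the `git-archimport-id:` value are made up.

```shell
#!/bin/sh
work=$(mktemp -d)
# Throwaway identity so that `git commit` works without any configuration.
GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.org
GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.org
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

# Create a repository $work/$1 holding one file, committed with message $2.
make_repo() {
    git init -q "$work/$1" &&
    ( cd "$work/$1" &&
      echo 'hello' > file.txt &&
      git add . &&
      git commit -q -m "$2" )
}

make_repo a 'Add file'
make_repo b 'Add file

git-archimport-id: made-up-id'

# Same content, different messages: the trees match, the commits do not.
tree_a=$(git -C "$work/a" rev-parse 'HEAD^{tree}')
tree_b=$(git -C "$work/b" rev-parse 'HEAD^{tree}')
commit_a=$(git -C "$work/a" rev-parse HEAD)
commit_b=$(git -C "$work/b" rev-parse HEAD)

if [ "$tree_a" = "$tree_b" ]; then
    echo "trees are identical"
else
    echo "trees differ"
fi
```

For a whole history, the same comparison would have to be repeated for each
pair of corresponding commits.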