
 pysync - Default branch
Section: Unix

 

Added: Thu, Dec 7th 2000 10:10 UTC (7 years, 7 months ago) Updated: Mon, Oct 16th 2006 13:40 UTC (1 year, 9 months ago)


About:
Pysync has both a demonstration implementation of the rsync and related algorithms in pure Python, and a high speed librsync Python extension. The pure Python is not fast and is not optimized, however it does work and provides a simple implementation of the algorithm for reference and experimentation. It includes a combination of ideas taken from librsync, xdelta, and rsync. The librsync Python extension is less flexible and harder to understand, but is very fast.

Author:
abo [contact developer]

Rating:
8.35/10.00 (3 votes)

Tar/BZ2:
http://minkirri.apana.org.au/pub/python/pysync/pysync-2.24.tar.bz2
Changelog:
http://minkirri.apana.org.au/pub/python/pysync/ChangeLog
RPM package:
http://minkirri.apana.org.au/[..]pub/python/pysync/pysync-2.24-1.i386.rpm
Debian package:
http://minkirri.apana.org.au/[..]pub/python/pysync/pysync_2.24-1_i386.deb

Trove categories:
[Development Status]  5 - Production/Stable
[Environment]  Console (Text Based)
[Intended Audience]  Developers
[License]  OSI Approved :: GNU Lesser General Public License (LGPL)
[Operating System]  OS Independent
[Programming Language]  Python
[Topic]  Communications :: File Sharing, Education, Software Development :: Libraries :: Python Modules, System :: Archiving :: Compression

Dependencies:
No dependencies filed

 
Project admins:
» abo (Owner)

» Rating: 8.35/10.00 (Rank N/A)
» Vitality: 0.00% (Rank 8993)
» Popularity: 1.19% (Rank 4755)

   Record hits: 24,722
   URL hits: 7,088
   Subscribers: 16


 Branches

Branch Version Last release License URLs
Default 2.24 17-Oct-2003 GNU Lesser General Public License (LGPL) Tar/BZ2 Changelog

 Comments

[»] Handles large archives?
by Peter Abrahamsen - Jan 5th 2004 13:19:20

I'm curious as to whether this tool has the same architectural limitation as rsync, that it must build a complete archive listing before it begins to transfer files. I have a system with about 6-7 million files, and even though the vast majority of them are --exclude'd out, and even though the system has 2GB of RAM, rsync runs out of memory. Does this program work the same way?



    [»] Re: Handles large archives?
    by abo - Jan 9th 2004 00:59:25


    > I'm curious as to whether this tool has
    > the same architectural limitation as
    > rsync, that it must build a complete
    > archive listing before it begins to
    > transfer files. ...

    pysync only implements the delta calculation and patch application; it does not include any directory walk or network transport code. So pysync doesn't have those limitations, because it doesn't include that kind of functionality. Pysync could be used to implement something that does what you want without those limitations.

    Have a look at librsync, rdiff-backup, unison etc for possible other alternatives that might be closer to what you want.



      [»] Re: Handles large archives?
      by Luke Kenneth Casson Leighton - Jun 6th 2005 15:39:39

      search on google.com for "python rsync". almost right at the top is someone implementing rsync in python, but he hasn't got round to doing the bits that this guy has. combine the two projects and you have a _complete_ implementation of rsync in python. i aim to investigate this project because i want to be able to offer different files "merged" into one single repository, depending on who connects to the rsync server :) i.e. i can back up several machines, but the config files will be different.... cool, huh? :)



        [»] Re: Handles large archives?
        by abo - Jun 6th 2005 19:50:54


        > search on google.com for "python rsync".
        > almost right at the top is someone
        > implementing rsync in python, but he
        > hasn't got round to doing the bits that
        > this guy has.


        I think you are referring to rsync.py. I just looked at it. It doesn't implement the rsync algorithm or any network transport. It only copies the walk/filter/copy functionality of rsync.

        Another interesting project is zsync. This implements an inverse rsync algorithm and uses a normal http server for network transport. It has no walk/filter/copy functionality.

        For those who want to add walk/filter functionality to pysync, I have bits and pieces that might be useful:
        efnmatch.py rsync style extended fnmatch.
        dirscan.py rsync style include/exclude pattern directory scanning.
        ddiffutils.py efficient directory comparison walk generators.

        I haven't yet tied these together into a useful combination, but I should some day :-)
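As an illustration of what rsync-style include/exclude scanning involves, here is a minimal sketch. It is not efnmatch.py or dirscan.py (their contents are not shown here); the names `is_selected` and `scan` and the `(include, pattern)` rule format are made up for this example. It uses the standard fnmatch module, whose `*` also matches `/`, so it is looser than rsync's real pattern rules. Because `scan` is a generator, it yields paths as it walks instead of building a complete listing first.

```python
import fnmatch
import os


def is_selected(path, rules):
    """Return the verdict of the first rule whose pattern matches path.

    rules is a list of (include, pattern) pairs; first match wins.
    """
    for include, pattern in rules:
        if fnmatch.fnmatchcase(path, pattern):
            return include
    return True  # nothing matched: include by default


def scan(root, rules):
    """Yield selected file paths under root, relative to root, using '/'."""
    for dirpath, dirnames, filenames in os.walk(root):
        prefix = os.path.relpath(dirpath, root).replace(os.sep, "/")
        prefix = "" if prefix == "." else prefix + "/"
        # Prune excluded directories in place so os.walk never enters them.
        dirnames[:] = [d for d in dirnames
                       if is_selected(prefix + d + "/", rules)]
        for name in sorted(filenames):
            if is_selected(prefix + name, rules):
                yield prefix + name
```

For example, `scan(".", [(False, "cache/"), (False, "*.pyc")])` would skip a top-level cache directory without descending into it and drop compiled files everywhere else.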



[»] Version 2.24 release
by abo - Oct 17th 2003 09:37:22

Version 2.24 has been released to update pysync for the new librsync 0.9.6. Also includes some minor tweaks, including psyco support which gives a 33% speedup.

If anyone wants windows binaries, let me know and I'll build them.



[»] Version 2.16
by abo - Jun 24th 2002 19:19:17

Version 2.16 was a quick release to include the rollsum extension module. I am currently working on improving librsync and wanted to release pysync with the work I had completed on it thus far before leaving it for a little while.

It does not include inverse delta support yet, and the librsync extension is unchanged. This means the librsync incremental API and memory leak problems are still present. After I finish improving librsync I will address these problems.



[»] Windows installer also available.
by abo - May 3rd 2002 06:17:16

Now that the release is out, I see they didn't like me putting the windows .exe installer as an "OS X" package :-).

For those who want it, a Windows installer for Python 2.1 is available at the ftp site.

For those who are intensely curious, there is a development diary in the Software Working File, which is also publicly visible.



[»] Release of version 2.7
by abo - May 2nd 2002 13:57:10

This release is a major milestone, including both an md4 sum extension module and a swig librsync extension.

Note that the API has changed a little for pysync to bring it more in line with rdiff. Both pysync.py and librsync.py can be used as drop-in replacements for rdiff, except that they use "rdelta" instead of "delta" as an option. This is to distinguish it from the pysync.py alternative of "xdelta". Note that the file parameters have changed order!

The librsync wrapper supports the higher-level file API, but the low-level API is currently faulty. I hope to have this fixed in the next release.

The other major change is the use of distutils to build releases. This allows me to produce RPMs and Windows installers. The source distribution comes as a Unix tar.bz2 or a Windows zip. Because Windows does not usually have support for autoconf and swig, the zip includes a pre-configured and swigged librsync. The tar.bz2 does not include librsync, so you will need to get it and the patch from the SourceForge rproxy project.



[»] Coming soon: librsync wrapper sponsored by Accellion.
by abo - Apr 22nd 2002 01:05:20

I am currently working on making a Python extension for librsync (part of the rproxy project on sourceforge) to add to pysync. This work has been sponsored by Accellion.

This should be finished before the end of this week (2002-04-26). Those interested in tracking this development can do so in the pysync Software Working File.



[»] Adding Reverse Deltas
by abo - Jan 30th 2002 19:35:47

It looks like I will be adding reverse-delta support to this soon, as I have a need for it. This will allow client-side delta calculation, reducing the load on a server.

I have some neat ideas about how this could be implemented simply using inheritance from the forward rdelta class. Any interested comments/encouragement will spur me on to implement this sooner :-).



[»] Now stable.
by abo - Sep 24th 2001 21:00:33

I've just changed the status of this from alpha to production/stable, because it basically is.

I haven't really used it enough to be 100% confident there are no bugs, but its objective of being a Python demonstration of the algorithm has been met.

There are some further things that could be done with it: using md4sums instead of md5sums, restructuring to allow reverse-patching, adding a Python interface to librsync, and simplifying it further. These are things I'm unlikely to do myself soon, but I'm wide open to patches, suggestions, whatever.



[»] The new zlib like API features
by abo - Mar 1st 2001 19:19:04

Release 1.2 introduced the new zlib-like API, allowing for incremental calculation of deltas and incremental application of patches. The comments at the top of pysync.py explain it all:

# Low level API signature calculation
sig=calcsig(oldfile)

# Low level API rsync style incremental delta calc from sig and newdata
delta=rdeltaobj(sig)
# or for xdelta style incremental delta calc from oldfile and newdata
# delta=xdeltaobj(oldfile)
incdelta=delta.calcdelta(newdata)
:
incdelta=delta.flush()

# Low level API applying incremental delta to oldfile to get newdata
patch=patchobj(oldfile)
newdata=patch.calcpatch(incdelta)
:

The rdeltaobj.flush() method supports R_SYNC_FLUSH and R_FINISH flush modes that behave the same as their zlib equivalents. Next on the TODO list are incremental signature calculation and further cleanups. Eventually I plan to create an md4sum module and move the rolling checksum code into C.
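For readers who have not used the zlib flush modes that this API is modelled on, the following sketch shows them in Python's standard zlib module. It demonstrates zlib only, not pysync; the helper name `stream_chunks` is made up for this example. A sync flush pushes out everything fed in so far, so the receiving side can recover it immediately, and the final flush terminates the stream.

```python
import zlib


def stream_chunks(chunks):
    """Compress chunks incrementally, decompressing each packet on arrival.

    Returns (progress, received): what the receiver had recovered after
    each sync flush, and the complete output once the stream is finished.
    """
    compressor = zlib.compressobj()
    decompressor = zlib.decompressobj()
    received = b""
    progress = []
    for chunk in chunks:
        packet = compressor.compress(chunk)
        # Z_SYNC_FLUSH emits all pending output without ending the stream.
        packet += compressor.flush(zlib.Z_SYNC_FLUSH)
        received += decompressor.decompress(packet)
        progress.append(received)
    # Z_FINISH ends the stream; nothing more can be compressed afterwards.
    tail = compressor.flush(zlib.Z_FINISH)
    received += decompressor.decompress(tail) + decompressor.flush()
    return progress, received
```

The pattern in the pysync comments above has the same shape: feed data in pieces, flush when the other side needs everything so far, and finish once at the end.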

The performance has been marginally hurt by this new API. Interestingly, the Python profiler shows that most of the time is wasted performing string copies when taking slices from input buffers, not actually doing the rsync. This suggests that significant performance increases might be achievable by re-arranging things a bit, rather than moving Python code into C.

I have also added a pysync-test.py script for thorough formal testing of pysync. It generates/reuses random test files that make pysync really work hard, verifying that it behaves as it should.

Incidentally, release 1.2 also fixed a rather embarrassing bug in release 0.9's adler32.py that corrupted the rolling checksums, resulting in heaps of missed matches. This caused seriously bad performance and very large deltas.
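To show why a corrupted rolling checksum leads to missed matches, here is a minimal sketch of a rolling weak sum in the style of the one rsync uses. It is an illustration under assumptions, not the code from adler32.py; the names `weak_sum` and `roll` are invented here. The rolled value must always equal the value computed from scratch for the same window; if the update step is wrong, the sums stop agreeing with the signature and identical blocks are no longer recognised.

```python
def weak_sum(block):
    """Compute the two 16-bit halves (a, b) of the weak sum from scratch."""
    a = sum(block) & 0xFFFF
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) & 0xFFFF
    return a, b


def roll(a, b, size, out_byte, in_byte):
    """Slide a window of `size` bytes forward by one byte in O(1)."""
    a = (a - out_byte + in_byte) & 0xFFFF
    # b loses size * out_byte and gains one copy of every byte now in the
    # window, which is exactly the updated a.
    b = (b - size * out_byte + a) & 0xFFFF
    return a, b


def check_rolling(data, size):
    """Return True if rolling agrees with recomputation at every offset."""
    a, b = weak_sum(data[:size])
    for start in range(1, len(data) - size + 1):
        a, b = roll(a, b, size, data[start - 1], data[start + size - 1])
        if (a, b) != weak_sum(data[start:start + size]):
            return False
    return True
```

A check like `check_rolling` is the kind of test that catches this class of bug early.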



    [»] Re: The new zlib like API features
    by damien morton - Mar 5th 2001 15:43:24


    > The performance has been marginally
    > hurt by this new API. Interestingly, the
    > python profiler shows that most of the
    > time is wasted performing string-copies
    > when taking slices from input buffers,
    > not actually doing the rsync. This
    > suggests that significant performance
    > increases might be achievable by
    > re-arranging things a bit, rather than
    > moving python code into C.

    I don't know if this can help, but you can create read-only buffers which are views into other buffers.

    >>> a = buffer("the quick brown fox jumped over the lazy dog")
    >>> a
    <read-only buffer for 007D8908, ptr 007D891C, size 44 at 007DD720>
    >>> buffer(a, 5, 10)
    <read-only buffer for 007D8908, ptr 007D8921, size 10 at 007DF298>

    This can save some copying.
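The buffer() builtin shown above exists only in Python 2. In Python 3 the same zero-copy slicing is available through memoryview, sketched below. The helper name `window_view` is made up for this example. Using a bytearray makes the sharing visible: a change to the underlying storage shows up in the view, which would not happen if the slice had copied the bytes.

```python
def window_view(buf, offset, size):
    """Return a view of `size` bytes starting at `offset`, without copying."""
    return memoryview(buf)[offset:offset + size]


data = bytearray(b"the quick brown fox jumped over the lazy dog")
view = window_view(data, 5, 10)

# Only converting the view to bytes makes a copy of those 10 bytes.
before = bytes(view)

# Modify the underlying buffer in place; the view reflects the change.
data[5] = ord("U")
after = bytes(view)
```

Slicing a bytes or str object directly, as in `data[5:15]`, allocates a new object each time, which is the copying cost the profiler was reporting.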





      [»] Re: The new zlib like API features
      by abo - Mar 5th 2001 17:59:29


      > I don't know if this can help, but you
      > can create read-only buffers which are
      > views into other buffers.
      Yes, I discovered buffer() by accident soon after I released 1.2 (I was trying to decipher extended slices in 2.0 and found a reference to it in the docs). It does make a big difference, and I've already started experimenting with it. It also opens up a few implementation options that were closed before. I'll probably have a new, slightly faster and simpler version out soon.



        [»] Re: The new zlib like API features
        by abo - Mar 13th 2001 20:22:07

        The new version 1.7 now takes advantage of buffer(), plus a few other simplifications and optimisations. This version is approximately 33% faster than version 1.2. I have also properly implemented the xdelta style delta calculation using a fairly neat inheritance from the rdelta class. This gives optimal deltas but requires direct access to the original oldfile.



[»] Where to from here?...
by abo - Dec 17th 2000 19:41:24

There have been a few downloads already, so I figure at least some people have looked at this. I'd like some feedback on where to take it from here... Since this is such a small piece of code, there is nothing like a supporting webpage or development site. I'm tossing up whether to create a SourceForge project for it, or just post it as a code snippet. It's so small I hardly feel it's worth it, but I guess bug tracking etc. might be useful. In the meantime, email me with suggestions. I've already started working on cleaning up the API to be more like zlib, along the lines of rproxy's libhsync. I'll be releasing the new version soon. This should make it easier to use for real applications and provide a better reference API.



    [»] Re: Where to from here?...
    by abo - Mar 1st 2001 19:26:33

    Well, as people may have noticed, I've decided that for a project as small as pysync, the best solution is to make the freshmeat entry the official homepage. Please post comments, suggestions, bugreports, etc here or email them direct to me. If this starts to grow too big, I'll then consider something like sourceforge.



[»] Very simple, can do things rsync and xdelta can't.
by abo - Dec 7th 2000 19:17:21

I hate to blow my own trumpet, but I thought I'd add a bit more info not really applicable for the description.

This is _really_ simple... it is only about 300 lines, and half of those are comments containing descriptions and observations. It should be dead easy for anyone to read, understand, and modify.

It also implements a breakdown of the rsync algorithm that you can't easily get from any of the current C based implementations, which means you can use it for things you can't use rsync or xdelta for. Its "Usage" says it all:

Usage:
pysync sig oldfile sigfile
... generates signature file sigfile from oldfile

pysync delta newfile sigfile diffile
... generates delta file diffile for newfile from sigfile

pysync apply oldfile diffile newfile
... applies delta file diffile to oldfile to generate newfile
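The three steps in the usage text can be sketched in a few lines of Python. This is an illustration of the general signature/delta/apply idea under stated assumptions, not pysync's code or file formats: the names `BLOCK`, `weak`, `make_sig`, `make_delta` and `apply_delta`, the tiny block size, and the `("copy", index)` / `("data", bytes)` delta encoding are all invented for this example. For brevity it recomputes the weak sum for every window instead of rolling it.

```python
import hashlib

BLOCK = 4  # tiny block size so the example is easy to trace (hypothetical)


def weak(block):
    """Cheap 32-bit checksum in the style of rsync's weak sum."""
    a = sum(block) & 0xFFFF
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) & 0xFFFF
    return (b << 16) | a


def make_sig(old):
    """'sig' step: map weak sum -> {strong hash: block index} for old."""
    sig = {}
    for start in range(0, len(old), BLOCK):
        block = old[start:start + BLOCK]
        strong = hashlib.md5(block).digest()
        sig.setdefault(weak(block), {})[strong] = start // BLOCK
    return sig


def make_delta(new, sig):
    """'delta' step: encode new using only the signature of old."""
    ops, literal, pos = [], bytearray(), 0
    while pos < len(new):
        window = new[pos:pos + BLOCK]
        candidates = sig.get(weak(window))
        index = None
        if candidates:  # weak sum hit: confirm with the strong hash
            index = candidates.get(hashlib.md5(window).digest())
        if index is None:
            literal.append(new[pos])  # no match: emit one literal byte
            pos += 1
            continue
        if literal:
            ops.append(("data", bytes(literal)))
            literal.clear()
        ops.append(("copy", index))
        pos += len(window)
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def apply_delta(old, ops):
    """'apply' step: rebuild new from old plus the delta."""
    out = bytearray()
    for kind, value in ops:
        if kind == "copy":
            out += old[value * BLOCK:(value + 1) * BLOCK]
        else:
            out += value
    return bytes(out)
```

The point of the breakdown is that `make_delta` never sees the old file, only its signature, so the signature and the delta can be computed on different machines and at different times.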





© Copyright 2008 SourceForge, Inc., All Rights Reserved.