Cleanup strategy for Release repository


Bruno Bonacci

Hi guys,
is there any way to clean up old artifacts from the Release repository?
We deploy not only .jar and .war files to Nexus, but also .tar.gz and .rpm packages (required by our deployment process).
We work in two-week iterations, and every iteration we release most of our packages.
As a consequence, every two weeks we push ~200 MB of artefacts to Nexus.
The repository is growing very fast, and although there is a task to clean up old snapshots, I haven't found anything for the Release repository. Artifacts older than two iterations are useless because they have already been replaced at least twice.

Does anyone have a suggestion for how to clean up the Release repository 'automatically'?
bye
Bruno

Re: Cleanup strategy for Release repository

Tamás Cservenák
Hi there,

In general, Maven "release" repositories are "carved in stone". What is there is to be left _forever_. So the notion of an "old artifact" is undefined in the "Maven world". Hence, Nexus does not offer anything out of the box to support use cases like this.

How do you handle build reproducibility this way? Or do you not need to rebuild anything that is "older than two iterations"?

If so, you may consider:

a) using some sort of snapshot repository and deploying snapshots in some controlled manner.
b) diverting deployments of those "transient" release artifacts of yours into "transient" repositories, which get created and added to a group at the beginning of every new iteration, and removed (hence their content deleted) at the end of the iteration or two weeks after it ends. Kind of a "caterpillar" approach. The group would always offer you the "current" view of all the contents you need.

The "non-transient" releases should be done in "usual" way and should remain "carved in stone".

But again, there are multiple problems with this approach if you use Maven: release artifacts "popping up" and "disappearing" is a no-no. Maven downloads them once and will _never_ check for remote changes again. This means you will be left with the burden of maintaining your local repositories too (e.g. on CI systems, by regularly nuking the local repository).




Thanks,
~t~


On Thu, Jul 8, 2010 at 1:41 PM, Bruno Bonacci <[hidden email]> wrote:


Hi guys,
there is any way to clean up the Release repository from old artifacts?
We deploy in Nexus, not only .jar and .war, but also tar.gz and .rpm
(required by our deployment process).
We have a two weeks iteration, and every iteration we release most of our
packages.
The consequence is that every two weeks we push in Nexus ~200Mb of
artefacts.
The repository is growing very fast, and although there is a task to clean
up the old snapshot releases, I haven't found anything for the Release
repository. Artifacts older than two iterations are useless because they
have been already replaced at least twice.

Does anyone have any suggestion how to cleanup 'automatically' the Release
repository?
bye
Bruno
--
View this message in context: http://maven.40175.n5.nabble.com/Cleanup-strategy-for-Release-repository-tp1044998p1044998.html
Sent from the Nexus Maven Repository Manager Users List mailing list archive at Nabble.com.




RE: Cleanup strategy for Release repository

Stephen Chamberlain (Collis)
In reply to this post by Bruno Bonacci
Hi Bruno,

I don't think Nexus has a mechanism for this. The philosophy is that snapshot releases represent development builds (i.e. non-released software). This means they can be cleaned up regularly as release versions are deployed. If Nexus were to auto remove release artifacts, you could not guarantee to be able to rebuild a released version of your software (unless you check out your code at a tagged version, rebuild and redeploy, etc).

Perhaps you could consider deploying more snapshot artifacts; do you really make production releases every two weeks? Or are these versions for internal review/testing?

With kind regards,

Stephen Chamberlain
Senior Software Developer/ScrumMaster





Re: Cleanup strategy for Release repository

Tim O'brien
In reply to this post by Bruno Bonacci
Support for this isn't built-in for the reasons that Tamas alluded to.

Despite that, I can see where this would become a necessity.  In these
cases, you can create a cron job to delete artifacts older than a
certain age and then kick off a reindex for the repository you need to
clear out.

I've done this a few times myself.  For example, I use Nexus to store book
content, and I don't need reproducibility.   While I could store
historical releases from 2008 and 2009, it would serve no reasonable
purpose.
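
The file-system variant of this idea can be sketched in Groovy, the language used for the scripts later in this thread. This is only a sketch under assumptions: it treats any directory that directly contains a .pom file as a version directory, and the function name findStaleVersionDirs and the path in the usage comment are hypothetical. It only lists candidates; deleting them and triggering the reindex in Nexus are left to the operator.

```groovy
// Hypothetical helper, not part of Nexus: lists version directories under
// `root` in which every file is older than `maxAgeDays`.
// Assumption: a version directory is one that directly contains a .pom file.
List<File> findStaleVersionDirs(File root, int maxAgeDays, long now = System.currentTimeMillis()) {
    long cutoff = now - maxAgeDays * 24L * 60 * 60 * 1000
    List<File> stale = []
    root.eachDirRecurse { File dir ->
        // listFiles() may return null on I/O errors, so fall back to an empty list
        def files = (dir.listFiles() ?: []).findAll { it.isFile() }
        boolean isVersionDir = files.any { it.name.endsWith('.pom') }
        if (isVersionDir && files.every { it.lastModified() < cutoff }) {
            stale << dir
        }
    }
    return stale.sort { it.path }
}

// Example call (the storage path is a placeholder, adjust it to your installation):
// findStaleVersionDirs(new File('/path/to/storage/releases'), 60).each { println it.path }
```

Printing the candidates first, instead of deleting them directly, makes it possible to review the list before anything is removed.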



RE: Cleanup strategy for Release repository

Bruno Bonacci
In reply to this post by Stephen Chamberlain (Collis)
Hi,

yes, we follow an agile process, and since we work on a SaaS (Software as a Service) product, we deploy to LIVE every two weeks.
As you can imagine, artefacts that are a few months old are really of no use at all.

We still have the SVN tags in case we need them.

I do understand that a release is conceptually forever, but, for example, RCs (release candidates) can be deleted after the final release is made, and there are other strategies that can be applied in other cases.

thx anyway for your reply.
bye
Bruno

Re: Cleanup strategy for Release repository

Bruno Bonacci
In reply to this post by Tim O'brien
Thanks Tim,
I was working on two different approaches:

1) a cron job that removes the 'older' artifacts directly from the underlying storage and then reindexes (as you suggested)
2) or a script that uses the REST API to scan, find and delete unwanted releases.

I'll evaluate the best solution...
bye
bruno

Re: Cleanup strategy for Release repository

YanaNovikova
This post has NOT been accepted by the mailing list yet.
We have the same issue with Nexus Repository Manager: we have many releases of the same applications, some of them 5-6 years old. We don't want to keep old releases in the repository, since we have many applications and release each of them several times a year.

I was trying to see if we could write an automated task using the Nexus plugin API to remove old releases. I was able to create a new type of task in the Nexus Scheduled Tasks section, but was not able to access repositories and remove artifacts from inside the task. Now I am thinking of having my task access the file system, remove old releases and then kick off a reindex of the repositories. Would that be an acceptable solution to this problem?

Re: Cleanup strategy for Release repository

Bruno Bonacci

Hi Yana,

I wrote a Groovy script that finds Release Candidates and removes them using the REST interface.
RCs represent around 60% of the total releases in our repo, so removing all RCs that already have a final release has been a quick win (it saved 50% of disk space).

If you know a bit of Groovy, it is very easy to change the script to implement a new strategy for cleaning up old versions.
If you want, I can share the script with you (or with anyone who is interested).
bye
Bruno


Re: Cleanup strategy for Release repository

htekwani
This post has NOT been accepted by the mailing list yet.
Hi Bruno,

Our team has recently started using Maven as well as the Nexus repository. Like you, we have a similar requirement to delete the unwanted release artifacts that keep accumulating over time and eat up all the disk space.

It would be great if you could share your scripts with me. Basically, I want to learn how to use the REST APIs and also have a look at the Groovy script, as I am completely new to Groovy. I want to see whether it would be possible for me to adopt Groovy or whether I should find an alternative in Java.

Thanks in advance,
Henika

Re: Cleanup strategy for Release repository

Bruno Bonacci
Hi, here is my groovy script (groovy 1.7.1) (source attached)


/*
*
* export NUSER=admin
* export NPASS=<NexusAdminPass>
* groovy NexusCleaner.groovy | xargs -I F curl -v -X DELETE -u "$NUSER:$NPASS" F
*
*/

class NexusCleaner {

    def settings = [ 
        baseUrl: 'http://192.168.12.200:8080/nexus/service/local/repositories/releases/content/',
     ];


    def static main( def args )
    {
        def nc = new NexusCleaner();
        nc.findRC();
    }

    def findRC()
    {

        def urls = scanRepo( settings.baseUrl );

        urls.findAll(){ it ==~ /.*-RC\d+\/$/ }.each() {
            ver -> 
            // calculate what would be the final version
            def rel = ver.replaceAll(/-RC\d+\/$/, '/' );
        
            // check if the final version already exists
            if( urls.find(){ it == rel } != null )
                println ver;
        }
  
        // also list irregular SNAPSHOT directories for deletion, e.g. 1.0.2.0-SNAPSHOT2, 1.0.2.0-RC1SNAPSHOT2
        urls.findAll(){ it ==~ /.*SNAPSHOT.*\//}.each(){println it }

    }

    def scanRepo( def url ) {
        def urls = [];
    
        def data = fetchContent( url );
        data.data.'content-item'.each(){
            item-> 
            def name = item.text.text();
            if( item.leaf.text() == 'false' )
            {
                if(!( name ==~ /^\d+\.\d+\.\d+\.\d+.*/ )) // not yet at the release-number level: descend further
                {
                    urls += scanRepo( item.resourceURI.text() );
                }
                else
                    urls << item.resourceURI.text();
            }
        }

        return urls;
    }

    def fetchContent( String url )
    {
        def txt = new URL( url ).text;
        def recs = new XmlSlurper().parseText( txt );
        return recs;
    }

}

Basically, the scanRepo() method scans the repository down to the release-number level (e.g. 1.0.3.4).
At the end, you end up with a list of URLs such as:


http://192.168.10.10:8080/nexus/service/local/repositories/releases/content/com/mycompany/project/module/1.0.1.0-RC1/
http://192.168.10.10:8080/nexus/service/local/repositories/releases/content/com/mycompany/project/module/1.0.1.0-RC2/
http://192.168.10.10:8080/nexus/service/local/repositories/releases/content/com/mycompany/project/module/1.0.1.0-RC3/
http://192.168.10.10:8080/nexus/service/local/repositories/releases/content/com/mycompany/project/module/1.0.1.0/

The findRC() method then searches for all URLs ending in '-RCn/' and prints to standard output those RCs that already have a final release. It also prints irregular SNAPSHOT directories such as 1.0.2.0-SNAPSHOT2.
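
The RC check can be tried in isolation, without a Nexus server. The following is a minimal sketch of the same filtering logic as a standalone function; the name supersededCandidates and the host nexus.example are made up for illustration and are not part of the attached script.

```groovy
// Hypothetical standalone version of the RC filter: returns the RC URLs
// for which a final release URL is also present in the list.
List<String> supersededCandidates(List<String> urls) {
    def all = urls as Set
    def candidates = urls.findAll { it ==~ /.*-RC\d+\/$/ }
    // Strip the -RCn suffix to get the URL the final release would have
    return candidates.findAll { rc -> all.contains(rc.replaceAll(/-RC\d+\/$/, '/')) }
}

// Sample data only; nexus.example is a placeholder host
def base = 'http://nexus.example/content/com/mycompany/project/module/'
def sample = [
    base + '1.0.1.0-RC1/',
    base + '1.0.1.0-RC2/',
    base + '1.0.1.0/',
    base + '1.0.2.0-RC1/'   // no final 1.0.2.0 yet, so this one is kept
]
supersededCandidates(sample).each { println it }
```

With this sample only the two 1.0.1.0 release candidates are reported, because 1.0.2.0 has no final release in the list.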

Then, using curl (on Linux or Cygwin), I send a DELETE request for each URL that I want to remove.

$ export NUSER=admin
$ export NPASS=<NexusAdminPass>
$ groovy NexusCleaner.groovy | xargs -I F curl -v -X DELETE -u "$NUSER:$NPASS" F
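
Where curl is not available, the DELETE step could also be done from Groovy using the JDK's HttpURLConnection. This is a sketch under assumptions, not part of the attached script: the function names basicAuthHeader and deleteUrl are hypothetical, it uses HTTP Basic authentication with the same NUSER and NPASS environment variables, and it reads the URLs from standard input, one per line.

```groovy
// Builds the HTTP Basic "Authorization" header value for the given credentials.
String basicAuthHeader(String user, String pass) {
    return 'Basic ' + (user + ':' + pass).getBytes('UTF-8').encodeBase64().toString()
}

// Sends a DELETE request to one URL and returns the HTTP status code.
int deleteUrl(String url, String user, String pass) {
    HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection()
    conn.requestMethod = 'DELETE'
    conn.setRequestProperty('Authorization', basicAuthHeader(user, pass))
    try {
        return conn.responseCode   // this is the call that actually sends the request
    } finally {
        conn.disconnect()
    }
}

// Only runs when both credentials are set, so nothing is deleted by accident.
def user = System.getenv('NUSER')
def pass = System.getenv('NPASS')
if (user && pass) {
    System.in.eachLine { line ->
        def url = line.trim()
        if (url) println "${deleteUrl(url, user, pass)} ${url}"
    }
}
```

It would be fed the same way as curl, by piping the output of NexusCleaner.groovy into it.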


The final step is to empty the trash in Nexus.

Note that if your release numbers are not composed of four dot-separated numbers, you'll have to modify this line in scanRepo() to match your release pattern.
        if(!( name ==~ /^\d+\.\d+\.\d+\.\d+.*/ )) // not yet at the release-number level: descend further
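
To see what such a change looks like, the four-part pattern from the script can be compared with a possible three-part alternative. This is a sketch only: the threePart pattern and the helper isVersionDir are hypothetical and would need adjusting to the actual version scheme in use.

```groovy
import java.util.regex.Pattern

// The pattern used in scanRepo(): four dot-separated numbers, then anything.
def fourPart = ~/^\d+\.\d+\.\d+\.\d+.*/
// Hypothetical alternative for major.minor.patch versions such as 1.4.2 or 1.4.2-RC1.
def threePart = ~/^\d+\.\d+\.\d+.*/

// True when the whole directory name matches the given version pattern.
boolean isVersionDir(String name, Pattern pattern) {
    return pattern.matcher(name).matches()
}

['1.0.3.4', '1.0.3.4-RC2', '1.4.2', 'module', 'com'].each { name ->
    println "${name}: fourPart=${isVersionDir(name, fourPart)} threePart=${isVersionDir(name, threePart)}"
}
```

Note that the looser three-part pattern also matches four-part names, since both patterns accept any suffix. Either pattern would misclassify a group or artifact directory whose name happens to look like a version number.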

bye
Bruno

NexusCleaner.groovy

Re: Cleanup strategy for Release repository

Bruno Bonacci
In addition...
The above script can easily be modified to implement different strategies.
For instance, if you want to delete artifacts older than a certain age, you can replace the urls list with a map,
have scanRepo() record more information than just the URL, such as the lastModified timestamp, and then filter out the entries older than x months.
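
The filtering half of that idea can be sketched as follows. It assumes scanRepo() has already been changed to return a map from URL to a Date; how the timestamp is read from the repository listing is not shown here, and the function name olderThan and the sample entries are made up for illustration.

```groovy
import java.text.SimpleDateFormat

// Hypothetical helper: returns the URLs whose timestamp is more than
// `months` months before `now`.
List<String> olderThan(Map<String, Date> lastModifiedByUrl, int months, Date now = new Date()) {
    def cal = Calendar.getInstance()
    cal.time = now
    cal.add(Calendar.MONTH, -months)
    Date cutoff = cal.time
    return lastModifiedByUrl.findAll { url, modified -> modified.before(cutoff) }.keySet().toList()
}

// Sample data only; real entries would come from the modified scanRepo()
def fmt = new SimpleDateFormat('yyyy-MM-dd')
def sample = [
    'module/1.0.1.0/': fmt.parse('2010-01-10'),
    'module/1.0.5.0/': fmt.parse('2010-06-20')
]
olderThan(sample, 3, fmt.parse('2010-07-15')).each { println it }
```

As in the original script, the result is only printed, so the list can be reviewed before it is passed on to the DELETE step.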

bye
bruno

RE: Cleanup strategy for Release repository

htekwani
This post has NOT been accepted by the mailing list yet.
In reply to this post by Bruno Bonacci

Thanks a lot Bruno. I will have a look at the scripts and let you know how it goes.

 

-Henika

 

From: Bruno Bonacci [via Maven] [mailto:[hidden email]]
Sent: Wednesday, July 14, 2010 8:28 PM
To: Henika Tekwani
Subject: Re: Cleanup strategy for Release repository

 



View message @ http://maven.40175.n5.nabble.com/Cleanup-strategy-for-Release-repository-tp1044998p1092857.html

 


RE: Cleanup strategy for Release repository

Bruno Bonacci
No worries,
if you need help let me know..
bye
Bruno

RE: Cleanup strategy for Release repository

Brian Fox
You could create a plugin for Nexus that introduces a new scheduled task to do your cleanup. It has been requested once or twice in the past.

On Thu, Jul 15, 2010 at 10:33 AM, Bruno Bonacci <[hidden email]> wrote:

No worries,
if you need help let me know..
bye
Bruno

--
View this message in context: http://maven.40175.n5.nabble.com/Cleanup-strategy-for-Release-repository-tp1044998p1223541.html
Sent from the Nexus Maven Repository Manager Users List mailing list archive at Nabble.com.




RE: Cleanup strategy for Release repository

Jeff.Blaisdell
I tweaked Bruno's script to use RESTClient, and rely completely on the Nexus REST API.

It will remove artifacts, and rebuild Nexus metadata.

To run it, you would need http-builder, and all transitive dependencies.

<dependency>
    <groupId>org.codehaus.groovy.modules.http-builder</groupId>
    <artifactId>http-builder</artifactId>
    <version>0.5.1</version>
</dependency>

Note:  I had trouble running it w/ Eclipse Groovy Plugin.  Works great with straight Groovy Console.

Would love it in a nexus plugin, but it'll be tough to find the time to learn another plugin api...

NexusArtifactCleanup.groovy

RE: Cleanup strategy for Release repository

hujirong
This post has NOT been accepted by the mailing list yet.
Oliver stored this code in GitHub, and I have some questions in the comment. Please help. Thanks.
https://gist.github.com/oliverdaff/2233777