Skip to content

Add script to remove uninstallable revisions from lock files#838

Closed
bernt-matthias wants to merge 14 commits into
usegalaxy-eu:masterfrom
bernt-matthias:fix-uninstallable
Closed

Add script to remove uninstallable revisions from lock files#838
bernt-matthias wants to merge 14 commits into
usegalaxy-eu:masterfrom
bernt-matthias:fix-uninstallable

Conversation

@bernt-matthias

@bernt-matthias bernt-matthias commented Mar 20, 2025

Copy link
Copy Markdown
Contributor

Tested this manually here: https://github.com/Helmholtz-UFZ/ufz-galaxy-tools

I also have code for parsing the tool versions from all the tool revisons (get_all_versions) tested this. In all cases it was unbumped tool versions.

Comment thread scripts/fix-uninstallable.py Outdated

@bernt-matthias bernt-matthias left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to make a (manual?) test run. But since after the fix step there is another lint step before the actual upate not much could got wrong... I guess :)

Comment thread scripts/fix-uninstallable.py Outdated
if nxt:
if nxt in tool["revisions"]:
print(f"remove {cur} in favor of {nxt} {name} {owner}")
remove.append(cur)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script will remove an uninstallable revision from the list if the next installable one is already in it.

Comment thread scripts/fix-uninstallable.py Outdated
@bgruening

Copy link
Copy Markdown
Member

Hi guys, I lost a but track here. I'm happy to merge it and keep the script here, but please feel also free to create a run-PR of this script so we see the impact and to clean the repo.

Thanks

@martenson

martenson commented Apr 7, 2025

Copy link
Copy Markdown

@bernt-matthias I think this is close to perfect, I played with it and opened bernt-matthias#1 to fix the process of adding the next installable revision (and there is also some polish if you want), but apart from that this seems to work great.

@bernt-matthias

Copy link
Copy Markdown
Contributor Author

@bernt-matthias I think this is close to perfect, I played with it and opened bernt-matthias#1 to fix the process of adding the next installable revision (and there is also some polish if you want), but apart from that this seems to work great.

Hi @martenson .. Also have some improvements nearly ready. Will merge with your efforts and push soon.

@martenson

Copy link
Copy Markdown

@bernt-matthias looks awesome, I'll test some more tomorrow. I think renaming of the script could improve its appeal: uninstallable is not the same as non-installable. Locally I have it as fix_outdated but tbh any other name would be better. :)

Comment thread scripts/fix-uninstallable.py Outdated
@bernt-matthias

Copy link
Copy Markdown
Contributor Author

Two more comments:

  • It seems that there can be uninstallable revisions that are not due to having the same version as a later revision (e.g. flye from bgruening de24438c9988 where the next installable revision would be 8d4f03b5fe9d, graphclust_align_cluster has no installable revisions). No idea what is the reason for this. But I changed the code such that there is only a warning in such cases.
  • I found at least one case on eu where the current revision was still installed. The repo info from the API contains some error, but the repo is still reported as installed, I guess that the update to the newer revision (with the same version) failed. Also only logging in these cases now.

TODO

  • Should we change CI such that only for trusted repos revisions are added?
  • For production we should cleanup the temporary clones of the TS repos

Currently running this on this repos. Will need a bit.

@martenson

martenson commented Apr 7, 2025

Copy link
Copy Markdown

It seems that there can be uninstallable revisions that are not due to having the same version as a later revision (e.g. flye from bgruening de24438c9988 where the next installable revision would be 8d4f03b5fe9d

Well that seems like a toolshed bug :/

graphclust_align_cluster has no installable revisions

This tool seems pretty broken, the xml is likely invalid but maybe it was not before?

I think in all these cases we shouldn't modify the lockfile. Maybe we could create an issue for each and try to track down the problems, if there are any takers.

@martenson

Copy link
Copy Markdown

for the latest version I've also encountered this:

Traceback (most recent call last):
  File "/Users/marten/devel/git/galaxy_tools/scripts/fix_outdated.py", line 196, in <module>
    fix_outdated(args.lockfile.name, args.toolshed, args.galaxy_url)
  File "/Users/marten/devel/git/galaxy_tools/scripts/fix_outdated.py", line 124, in fix_outdated
    all_revisions = get_all_revisions(toolshed_url, name, owner)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/marten/devel/git/galaxy_tools/scripts/fix_outdated.py", line 67, in get_all_revisions
    assert proc.returncode == 0, f"failed {' '.join(cmd)} in {repo_path}"
           ^^^^^^^^^^^^^^^^^^^^
AssertionError: failed hg update tip in /tmp/repos/toolshed.g2.bx.psu.edu-galaxyp-cardinal_combine

the entry looks like this:

- name: cardinal_combine
  owner: galaxyp
  revisions:
  - 2b1868ab3724
  - a9a28e46d54a
  - bf06d196e23d
  tool_panel_section_id: proteomics
  tool_panel_section_label: Proteomics

so in my understanding this should just remove the 2b1868ab3724, but hg fails with the above

@bernt-matthias

Copy link
Copy Markdown
Contributor Author

So far, for me, deleting the temporary directory for this repo solved the problem (so maybe temporary problem with the remote?). Wondering if we should include stderr, stdout here.

@martenson

Copy link
Copy Markdown

the specific repo seems broken :/

$ hg clone https://toolshed.g2.bx.psu.edu/repos/galaxyp/cardinal_combine
destination directory: cardinal_combine
requesting all changes
adding changesets
adding manifests
adding file changes
added 21 changesets with 769 changes to 260 files
new changesets 93be1d20b5c3:a9a28e46d54a
updating to branch default
abort: case-folding collision between test-data/cluster_skm.RData and test-data/cluster_skm.rdata

@bernt-matthias

Copy link
Copy Markdown
Contributor Author

Works for me

hg clone https://toolshed.g2.bx.psu.edu/repos/galaxyp/cardinal_combine
destination directory: cardinal_combine
requesting all changes
adding changesets
adding manifests
adding file changes
added 21 changesets with 769 changes to 260 files                                                                                                                     
new changesets 93be1d20b5c3:a9a28e46d54a
updating to branch default
218 files updated, 0 files merged, 0 files removed, 0 files unresolved

@martenson

martenson commented Apr 8, 2025

Copy link
Copy Markdown

yeah, case-sensitive vs case-insensitive filesystem :/

xref galaxyproteomics/tools-galaxyp#791

@bernt-matthias

Copy link
Copy Markdown
Contributor Author

To get an idea I added the changes to the .lock files and the log output (.fix).

Do not merge yet. Have another idea...

@bernt-matthias bernt-matthias marked this pull request as draft April 8, 2025 22:04
@martenson

Copy link
Copy Markdown

I propose showing the changes after run in a separate PR, it makes this PR hard to approach to everyone else

@bernt-matthias

Copy link
Copy Markdown
Contributor Author

My idea/suggestion is the following: we could have an additional much simpler step that just determines the installed revisions of a given Galaxy instance. Then we just remove everything from the lock files that is not installed. Should be super simple to do.

Should achieve similar results, but with much less resources: This script downloads GBs from the TS and just takes a lot of time -- at least for the first run - I guess subsequent runs will be much faster.

@martenson

martenson commented Apr 9, 2025

Copy link
Copy Markdown

@bernt-matthias I understand the urge to optimize but, as you said, to me this is basically a one-off cloning spree triggered only during the first run. Subsequent runs will have one repo to clone at most.

@bernt-matthias

Copy link
Copy Markdown
Contributor Author

@bernt-matthias I understand the urge to optimize but, as you said, to me this is basically a one-off cloning spree triggered only during the first run. Subsequent runs will have one repo to clone at most.

Sure, but my main motivation would be that its much simpler, i.e. makes less assumptions on things that are "not well documented" and understood by only a few people.

@martenson

Copy link
Copy Markdown

I think that is a fair take.

It likely does not cover the situation where you actually want to update the revision of the repo that did not bump tool version and maybe other corner cases, but likely does majority of the cleanup we want.

@arash77

arash77 commented Oct 17, 2025

Copy link
Copy Markdown
Member

Isn't it possible to get the data from ts.repositories.get_repository_revision_install_info(name, owner, revision) instead of cloning it?

@bernt-matthias

Copy link
Copy Markdown
Contributor Author

Isn't it possible to get the data

Which data?

@arash77

arash77 commented Oct 17, 2025

Copy link
Copy Markdown
Member

Which data?

repository metadata (all tool info, including versions)

For example, this has a lot of information, including the valid_tools that might be useful for our case:
https://testtoolshed.g2.bx.psu.edu/api/repositories/get_repository_revision_install_info?name=multiqc&owner=iuc&changeset_revision=08188e2202aa

@bernt-matthias

Copy link
Copy Markdown
Contributor Author

I see: the tool version could be extracted from the guid. This could then replace get_all_versions. How about get_all_revisions?

@arash77

arash77 commented Oct 17, 2025

Copy link
Copy Markdown
Member

I see: the tool version could be extracted from the guid. This could then replace get_all_versions. How about get_all_revisions?

Why do we need all of them?
We already have the revisions in the lock files and the ordered_installable_revisions as well. Shouldn’t that be enough?

@bernt-matthias

Copy link
Copy Markdown
Contributor Author

Could be impossible to get_next if the list of revisions is incomplete?

@arash77

arash77 commented Oct 17, 2025

Copy link
Copy Markdown
Member

Could be impossible to get_next if the list of revisions is incomplete?

Do we need this?
From what I understand, if we have only one installable revision for each version, Galaxy installs the latest revision automatically anyway, right?

@bgruening

Copy link
Copy Markdown
Member

This is now in a slightly different version in production.

@bgruening

Copy link
Copy Markdown
Member

Thanks @bernt-matthias!

@bgruening bgruening closed this Mar 18, 2026
@bernt-matthias

Copy link
Copy Markdown
Contributor Author

Can you xref the version you use?

@bgruening

Copy link
Copy Markdown
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants