Skip to content

feat: New history log api functions [FC-0123]#501

Open
ChrisChV wants to merge 11 commits intoopenedx:mainfrom
open-craft:chris/FAL-4330-history-log
Open

feat: New history log api functions [FC-0123]#501
ChrisChV wants to merge 11 commits intoopenedx:mainfrom
open-craft:chris/FAL-4330-history-log

Conversation

@ChrisChV
Copy link
Copy Markdown
Contributor

@ChrisChV ChrisChV commented Mar 16, 2026

Description

  • Implements python api functions to get and build the history log of an entity:
    • get_entity_draft_history: Return DraftChangeLogRecords for a PublishableEntity since its last publication.
    • get_entity_version_contributors: returns the users who authored changes between two published versions of an entity
    • get_entity_publish_history: Return all PublishLogRecords for a PublishableEntity
    • get_entity_publish_history_entries: Return the DraftChangeLogRecords associated with a specific PublishLog
    • get_descendant_component_entity_ids: BFS traversal of a container hierarchy, returning the IDs of all descendant component entities (used to scope queries when fetching history for
      container children)
  • Which edX user roles will this change impact? "Developer".

Supporting information

Testing instructions

Follow the testing instructions in openedx/frontend-app-authoring#2948

Deadline

Before the Verawood cut.

Other information

N/A

@openedx-webhooks openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Mar 16, 2026
@openedx-webhooks
Copy link
Copy Markdown

openedx-webhooks commented Mar 16, 2026

Thanks for the pull request, @ChrisChV!

This repository is currently maintained by @axim-engineering.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.
🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads
🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

Details
Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

@ChrisChV ChrisChV marked this pull request as draft March 16, 2026 23:12
@github-project-automation github-project-automation Bot moved this to Needs Triage in Contributions Mar 16, 2026
@ChrisChV ChrisChV changed the title feat: New history log api functions feat: New history log api functions [FC-0123] Mar 16, 2026
Return DraftChangeLogRecords for a PublishableEntity since its last publication,
ordered from most recent to oldest.

If the entity has never been published, all DraftChangeLogRecords are returned.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the intended behavior of this function when the last "publish" was a deletion?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the current code does work in this case, to be clear. And I realize that it's not something that's likely to come up in the UX. I just want to make sure that the behavior is well defined and tested for in that edge case.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Yes, I'll add tests for that case.

@ormsbee
Copy link
Copy Markdown
Contributor

ormsbee commented Mar 21, 2026

High level comment: please note that it's possible to reset a draft to the published state (i.e. discard changes). When that happens, the Draft version is reset to point to the same version as Published, but all the versions created still exist and are reflected in the draft change log.

@mphilbrick211 mphilbrick211 added the FC Relates to an Axim Funded Contribution project label Mar 23, 2026
@mphilbrick211 mphilbrick211 moved this from Needs Triage to Waiting on Author in Contributions Mar 23, 2026
Copy link
Copy Markdown
Contributor

@rpenido rpenido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good @ChrisChV!

.values_list("draft_change_log__changed_by", flat=True)
.distinct()
)
return get_user_model().objects.filter(pk__in=contributor_ids)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it is worth adding an order_by here to assert consistent results?

@ormsbee
Copy link
Copy Markdown
Contributor

ormsbee commented Mar 28, 2026

Per this slack thread, I think we're going to want to extend the PublishLog model to include a nullable field that captures what it was that the user clicked "publish" on. Right now, if there is a Subsection -> Unit -> Component setup where only the component has changed, the data in the PublishLog does not allow us to differentiate which level the user pressed "Publish" at, and this seems to be a requirement of the UX.

Does this sound correct to you @sdaitzman, @marcotuts?

FYI @crathbun428, @jmakowski1123

@ormsbee
Copy link
Copy Markdown
Contributor

ormsbee commented Apr 6, 2026

I made a separate issue to capture user publish intent (my comment above), so that we can discuss in more detail: http://github.com/openedx/openedx-core/issues/533

Comment on lines +755 to +757
# If reset_drafts_to_published() was called within this publish window, there
# will be a DraftChangeLogRecord where new_version == baseline. Use the most
# recent such record as the new lower bound so discarded entries are excluded.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I was the one who flagged this because I wanted to make sure the code noted it, but I actually don't think we should filter out discarded changes. It's a log, and "discard changes" is a legit thing that someone in that chain of edits did. I think we should be explicit that it happened, especially if people are going here because they're thinking, "I know I was working on this, but what happened to my edits!?!" Then they could see "Dave discarded changes" and know who to strangle talk to.


def get_descendant_component_entity_ids(container: Container) -> list[int]:
"""
Return the entity IDs of all leaf (non-Container) descendants of ``container``.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code in the containers applet shouldn't be querying Draft models directly, or ideally even be aware of components.

Where is this going to get called from?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code in the containers applet shouldn't be querying Draft models directly, or ideally even be aware of components.

I've refactored the function to not use Draft

Where is this going to get called from?

This function is called in edx-platform, in get_library_container_draft_history and in get_library_container_publish_history. This is what I have implemented, as I mentioned in Slack. It might change later after implement the direct field.

@ChrisChV ChrisChV marked this pull request as ready for review April 21, 2026 02:16
@ChrisChV ChrisChV requested a review from ormsbee April 21, 2026 02:16
Copy link
Copy Markdown
Contributor

@ormsbee ormsbee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review. Will add more in the morning. Thank you.

Comment on lines +768 to +782
reset_filter = {
"entity_id": entity_id,
"new_version_id": baseline_version_id,
"draft_change_log__changed_at__lte": published_at,
}
if prev_published_at:
reset_filter["draft_change_log__changed_at__gt"] = prev_published_at

last_reset_at = (
DraftChangeLogRecord.objects
.filter(**reset_filter)
.order_by("-draft_change_log__changed_at")
.values_list("draft_change_log__changed_at", flat=True)
.first()
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per @sdaitzman, we should represent the Discard Changes action as another kind of draft update, so we don't need to filter them out here.

Comment on lines +889 to +890
Intermediate containers (e.g. Subsections, Units) are never included in the
result; only leaf component entities are returned.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Question] Why do we only focus on Components and exclude intermediate containers? Don't we need to include changes to intermediate Containers to build a historical view of the parent Container?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation follows the logic discussed in openedx/frontend-app-authoring#2805 (comment)

For Containers, both lists follow the same structure and contain:

  • All direct edits made to the container itself.
  • All edits included in each publish group, for all descendant components.

E.g. for a section, an edit to a component grouped under a prior publish shows one edit, to that component (not to any in-between levels, or to the direct child subsection that is an ancestor of the component)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So does a rename of a Unit never show up at all?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is shown in the unit's history log:

All direct edits made to the container itself.

The logic I understood here is to show the important changes, which are the components, in the log. If we show only intermediate changes, for example, in a section's history log, it can display so many changes that it might overwhelm the user.

The current implementation keeps it compact and with relevant information.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I would expect it to show non-side-effect changes, i.e. ones where the actual version_num of the record changes. Because otherwise, we'd miss things like "Deleted this Unit", or "Removed this Subsectino" which are really big changes.

)


def get_descendant_component_entity_ids(container: Container) -> list[int]:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you're framing this as descendants using container traversal, but publishing follows the (very closely related) concept of dependencies. Unless you absolutely need to frame this in terms of containers, it might make more sense to do dependency traversal similar to how we traverse dependencies for publishing. It will likely be faster as well.

One of the reasons I mention this is that we will eventually have dependencies that live outside of the container->child relationship, like media assets, and eventually course-related things like grading policies.

Copy link
Copy Markdown
Contributor

@ormsbee ormsbee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please mark the new API calls as "[ 🛑 UNSTABLE ]" in their docstrings, like the containers APIs are.

Otherwise, two very small change requests to keep component knowledge out of the publishing applet. I realize that we may have to revise this pending further UX refinement, but this is really excellent work—very clearly written and well commented. Great job!

.filter(entity_id=entity_id)
.select_related(
"draft_change_log__changed_by",
"entity__component__component_type",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please get rid of this line here, and add it on as a .select_related() when this gets called from get_library_component_draft_history(), like:

    history_qs = content_api.get_entity_draft_history(component.publishable_entity)
    records = list(history_qs.select_related("entity__component__component_type")

Since get_library_component_draft_history() is aware of components anyway and is actively traversing them, I think it's better to keep that knowledge out of the publishing applet as much as possible.

.filter(entity_id=entity_id)
.select_related(
"publish_log__published_by",
"entity__component__component_type",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, please remove this line and add it back in as a select_related call from get_library_component_publish_history()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

FC Relates to an Axim Funded Contribution project open-source-contribution PR author is not from Axim or 2U

Projects

Status: Waiting on Author

Development

Successfully merging this pull request may close these issues.

5 participants