Date: Friday, April 05, 2024

Work Undertaken Summary

Risks

TMA02 deadline fast approaching, focus shifting to TMA work

Time Spent

- 1hr TMA02
- 1hr research understanding transformers
- 1.5hr Pepper Diagram
- 1hr StoryTransformer Dev work (Tagging)
- 1.5hr StoryTransformer Diagram
- 1.25hr infini-attention paper
- 0.5hr update schedule

Questions for Tutor

Next work planned

- [If you are registered in a Computing and IT (Honours degree) specialist route]
    
    Will the solution be within the specialism route of my degree?

I’m not on a degree specialisation. I assume a statement to that effect in the report would be useful?

Raw Notes

TMA02

TODO continue from here: TMA 02 Review of Work in Progress: 3.1 Preparation and planning | OU online (open.ac.uk)

Pepper Database Diagram First Attempt

---
title: Pepper Database Entity Relationship Diagram
---
erDiagram
    USER ||--o{ MESSAGE : sends
    USER ||--o{ WORKCARD : raises
    WORKCARD |o--o{ MESSAGE : contains
    TICKETBUG |o--o{ MESSAGE : contains
    TICKETTASK |o--o{ MESSAGE : contains
    WORKCARD ||--o{ TICKETTASK : contains
    WORKCARD ||--o{ TICKETBUG : contains
    ORGANISATION ||--o{ WORKCARD : "raises"
    ORGANISATION ||--o{ ORGANISATIONUSER : contains
    USER ||--o{ ORGANISATIONUSER : "works for"
    ORGANISATION {
        Guid Id
        string name
    }
    ORGANISATIONUSER {
        Guid OrganisationId PK,FK
        Guid UserId PK,FK
    }
    USER {
        Guid Id PK
        bool IsDisabled
        string FirstName_Calc
        string LastName_Calc
        string Initials_Calc
    }
    MESSAGE {
        Guid Id PK
        string RawText
        string PlainText
        int MessageType "1 = comment, 2 = internal comment"
        bool Hidden
        int TicketId FK "Links to WorkCard, Task or Bug"
        DateTime CreatedUtc
        Guid CreatedById FK
        DateTime DeletedUtc
        Guid DeletedById FK
        bool IsDeleted
    }

    WORKCARD {
        int Id PK
        string Subject
        string Description
        int StatusId
        WorkCardCategory Category "0 = Support, 1 = Chargeable work"
        Guid OrganisationId
        string Requirements
        DateTime CreatedUtc
        Guid CreatedById FK
        DateTime DeletedUtc
        Guid DeletedById FK
        bool IsDeleted
    }

    TICKETTASK {
        int Id PK
        string Subject
        string Description
        int WorkCardId FK
        DateTime CreatedUtc
        Guid CreatedById FK
        DateTime DeletedUtc
        Guid DeletedById FK
        bool IsDeleted
    }

    TICKETBUG {
        int Id PK
        string Subject 
        string Description 
        int WorkCardId FK
        DateTime CreatedUtc
        Guid CreatedById FK
        DateTime DeletedUtc
        Guid DeletedById FK
        bool IsDeleted 
    }

https://mermaid.js.org/syntax/entityRelationshipDiagram.html

Pepper Diagram second attempt

---
title: Pepper Database Entity Relationship Diagram
---
erDiagram
    USER ||--o{ MESSAGE : sends
    USER ||--o{ TICKET : raises
    TICKET |o--o{ MESSAGE : contains
    TICKET ||--o| TICKETBUG : "is a"
    TICKET ||--o| WORKCARD : "is a"
    TICKET ||--o| TICKETTASK : "is a"
    WORKCARD ||--o{ TICKETTASK : contains
    WORKCARD ||--o{ TICKETBUG : contains
    ORGANISATION ||--o{ TICKET : "raises"
    ORGANISATION ||--o{ ORGANISATIONUSER : contains
    
    USER ||--o{ ORGANISATIONUSER : "works for"
    ORGANISATION {
        Guid Id
        string name
    }
    ORGANISATIONUSER {
        Guid OrganisationId PK,FK
        Guid UserId PK,FK
    }
    USER {
        Guid Id PK
        bool IsDisabled
        string FirstName_Calc
        string LastName_Calc
        string Initials_Calc
    }
    MESSAGE {
        Guid Id PK
        string RawText
        string PlainText
        int MessageType "1 = comment, 2 = internal comment"
        bool Hidden
        int TicketId FK "Links to WorkCard, Task or Bug"
        DateTime CreatedUtc
        Guid CreatedById FK
        DateTime DeletedUtc
        Guid DeletedById FK
        bool IsDeleted
    }

    TICKET {
        int Id PK
        string Subject
        string Description
        TicketEntityType EntityType "0 = WC, 1 = Task, 2 = Bug"
        int StatusId
        DateTime CreatedUtc
        Guid CreatedById FK
        DateTime DeletedUtc
        Guid DeletedById FK
        bool IsDeleted
    }

    WORKCARD {
        int Id PK
        WorkCardCategory Category "0 = Support, 1 = Chargeable work"
        Guid OrganisationId
        string Requirements
    }

    TICKETTASK {
        int Id PK
        int WorkCardId FK
    }

    TICKETBUG {
        int Id PK
        int WorkCardId FK
    }

The ticketing system used by my employer is an in-house product called Pepper. Pepper handles both support requests and normal feature development. This diagram shows the slice of the system that matters to my project.

The Pepper system uses inheritance as part of its modelling. Ticket is an abstract class from which WorkCard, TicketBug and TicketTask derive. All three are stored in the same database table using the table-per-hierarchy pattern (Inheritance - EF Core | Microsoft Learn). This design was chosen so that bugs and tasks can easily be promoted into their own independent work cards.

classDiagram
    Ticket <|-- WorkCard
    Ticket <|-- TicketTask
    Ticket <|-- TicketBug
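
To make the table-per-hierarchy idea concrete, here is a minimal sketch of how one shared table with a discriminator column could map onto the three subclasses, and why promotion is cheap. It is written in Python for brevity and is not the Pepper implementation: the `materialise` and `promote_to_work_card` helpers, the `TYPE_MAP` lookup and the sample row are all hypothetical, and only the discriminator values (0 = WorkCard, 1 = Task, 2 = Bug) come from the diagram.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class TicketEntityType(IntEnum):
    # Discriminator values as noted on TICKET.EntityType in the diagram.
    WORK_CARD = 0
    TASK = 1
    BUG = 2


@dataclass
class Ticket:
    """Columns shared by every row of the single ticket table."""
    id: int
    subject: str
    description: str


@dataclass
class WorkCard(Ticket):
    requirements: str = ""


@dataclass
class TicketTask(Ticket):
    work_card_id: Optional[int] = None


@dataclass
class TicketBug(Ticket):
    work_card_id: Optional[int] = None


# Hypothetical lookup from discriminator value to the class to build.
TYPE_MAP = {
    TicketEntityType.WORK_CARD: WorkCard,
    TicketEntityType.TASK: TicketTask,
    TicketEntityType.BUG: TicketBug,
}


def materialise(row: dict) -> Ticket:
    """Build the right subclass for one row of the shared table."""
    cls = TYPE_MAP[TicketEntityType(row["EntityType"])]
    shared = dict(id=row["Id"], subject=row["Subject"],
                  description=row["Description"])
    if cls is WorkCard:
        return WorkCard(**shared, requirements=row.get("Requirements") or "")
    return cls(**shared, work_card_id=row.get("WorkCardId"))


def promote_to_work_card(row: dict) -> dict:
    """Promote a bug or task: the row keeps its Id, and only the
    discriminator and the subtype-specific column change."""
    promoted = dict(row)
    promoted["EntityType"] = int(TicketEntityType.WORK_CARD)
    promoted["WorkCardId"] = None
    return promoted


bug_row = {"Id": 42, "Subject": "Crash on save",
           "Description": "Saving a draft throws an error",
           "EntityType": 2, "WorkCardId": 7, "Requirements": None}
bug = materialise(bug_row)
card = materialise(promote_to_work_card(bug_row))
```

Because the row keeps the same Id after promotion, anything that references it by TicketId (such as messages) would not need to be rewritten, which is presumably part of the appeal of the single-table design.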

Story Transformer

classDiagram
    direction RL
    class StoryTransformer {
        -StoryDbContext context
        -StorySerializer serializer
        +StoryTransformer(context, serializer)
        +RunAsync(outputFolder) Task
        -GetBatchAsync(int batchNumber) Task
        -EnumerateWithIndex(WorkCard[] batch, int batchNumber)
    }

    StoryTransformer o-- StoryDbContext
    StoryTransformer o-- StorySerializer

    class StoryDbContext {
        +DbSet~WorkCard~ WorkCards
    }


    class StorySerializer {
        -IStoryFormatter formatter
        -IStoryTagger[] taggers
        -IStoryPseudoAnonymizer[] anonymizers
        +StorySerializer(formatter, taggers, anonymizers)
        +SerializeAsync(outputFolder, workCard, cardIndex)
        -CreateFileContentAsync(workCard)
        -GenerateTagsAsync(workCard, content) string[]
        -GeneratePseudoAnonymizedContentAsync(content) string
    }

    StorySerializer o-- IStoryFormatter
    StorySerializer o-- "*" IStoryTagger
    StorySerializer o-- "*" IStoryPseudoAnonymizer

    class IStoryFormatter {
        +Format(WorkCard workCard) string
    }

    <<interface>> IStoryFormatter

    class MarkdownStoryFormatter {
        -StringBuilder stringBuilder
        -ReverseMarkdown.Converter reverseMarkdown
        +MarkdownStoryFormatter()
        +Format(WorkCard workCard) string
        -AppendField(string fieldName, string value)
        -AppendHtmlField(string fieldName, string html)
        -GetUserName(User user)
    }

    IStoryFormatter <|.. MarkdownStoryFormatter : implements

    class IStoryTagger {
        +AddTagsAsync(ITagCollection tags, WorkCard workCard, string content)
    }

    <<interface>> IStoryTagger

    class KeywordTagger {
        +KeywordTagger()
        +AddTagsAsync(ITagCollection tags, WorkCard workCard, string content)
    }

    class FeatureTagger {
        +FeatureTagger()
        +AddTagsAsync(ITagCollection tags, WorkCard workCard, string content)
    }

    class TimescaleTagRemover {
        +TimescaleTagRemover()
        +AddTagsAsync(ITagCollection tags, WorkCard workCard, string content)
    }

    IStoryTagger <|.. KeywordTagger : implements
    IStoryTagger <|.. FeatureTagger : implements
    IStoryTagger <|.. TimescaleTagRemover : implements


    class IStoryPseudoAnonymizer {
        +PseudoAnonymizeAsync(string content)
    }

    <<interface>> IStoryPseudoAnonymizer

    class RegexAnonymizer {
        -RegexAnonymizer(AnonymizationRule[] rules)
        -AnonymizationRule[] rules
        +CreateAsync(string fileName)
        +PseudoAnonymizeAsync(string content)
    }

    IStoryPseudoAnonymizer <|.. RegexAnonymizer : implements

    class AnonymizationRule {
        +Regex Regex
        +string Replacement
    }

    RegexAnonymizer o-- "*" AnonymizationRule


    class ITagCollection {
        +Collection~string~ Tags
        +AddTag(string tag)
        +RemoveTag(string tag)
    }
    <<interface>> ITagCollection

    class TagCollection {
        -Set~string~ _tags
        -Set~string~ _removedTags
        +TagCollection()
        +Collection~string~ Tags
        +AddTag(string tag)
        +RemoveTag(string tag)
    }


    ITagCollection <|.. TagCollection : implements
    ITagCollection <-- IStoryTagger

I’ve designed the system to be highly modular, to allow rapid experimentation and iteration.

Most components interact only through an interface, so individual parts can be swapped out.

Most components receive their dependencies through constructor injection (inversion of control), so the whole pipeline can be configured at the top level and reused more widely.
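
The interface-plus-injection approach can be sketched roughly as follows. This is an illustrative Python version rather than the project's actual code: the method names are adapted from the class diagram, the work card is simplified to a plain dict, the async calls are made synchronous, and the e-mail rule is a made-up example of an anonymization rule.

```python
from __future__ import annotations

import re
from typing import Protocol, Sequence


class StoryFormatter(Protocol):
    def format(self, work_card: dict) -> str: ...


class StoryPseudoAnonymizer(Protocol):
    def pseudo_anonymize(self, content: str) -> str: ...


class MarkdownStoryFormatter:
    """One possible formatter; any object with a format() method fits."""

    def format(self, work_card: dict) -> str:
        return f"# {work_card['Subject']}\n\n{work_card['Description']}\n"


class RegexAnonymizer:
    """Applies (pattern, replacement) rules in order."""

    def __init__(self, rules: Sequence[tuple[re.Pattern, str]]) -> None:
        self._rules = list(rules)

    def pseudo_anonymize(self, content: str) -> str:
        for pattern, replacement in self._rules:
            content = pattern.sub(replacement, content)
        return content


class StorySerializer:
    """Depends only on the interfaces; collaborators are injected."""

    def __init__(self, formatter: StoryFormatter,
                 anonymizers: Sequence[StoryPseudoAnonymizer]) -> None:
        self._formatter = formatter
        self._anonymizers = list(anonymizers)

    def create_file_content(self, work_card: dict) -> str:
        content = self._formatter.format(work_card)
        for anonymizer in self._anonymizers:
            content = anonymizer.pseudo_anonymize(content)
        return content


# Top-level configuration: the only place that names concrete classes.
# The e-mail rule below is a hypothetical example, not a real project rule.
rules = [(re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<email>")]
serializer = StorySerializer(MarkdownStoryFormatter(), [RegexAnonymizer(rules)])
text = serializer.create_file_content(
    {"Subject": "Login fails", "Description": "Reported by jo@example.com"}
)
```

Swapping in a different formatter or adding a second anonymizer only changes the last few lines; `StorySerializer` itself stays untouched, which is the flexibility the design is aiming for.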

I added ITagCollection so that logic can run whenever tags are added, e.g. synonym mapping or deduplication.
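
A minimal sketch of what such a collection might look like, again in Python for brevity. The synonym table, the case-insensitive normalisation and the rule that a removed tag cannot be re-added are all assumptions made for illustration; only the AddTag / RemoveTag / Tags shape and the two internal sets come from the class diagram.

```python
from __future__ import annotations


class TagCollection:
    def __init__(self, synonyms: dict[str, str] | None = None) -> None:
        self._tags: set[str] = set()
        self._removed_tags: set[str] = set()
        # Hypothetical synonym table: variant -> canonical tag.
        self._synonyms = {k.lower(): v.lower()
                          for k, v in (synonyms or {}).items()}

    def _normalise(self, tag: str) -> str:
        key = tag.strip().lower()
        return self._synonyms.get(key, key)

    @property
    def tags(self) -> list[str]:
        return sorted(self._tags)

    def add_tag(self, tag: str) -> None:
        tag = self._normalise(tag)
        # The set deduplicates; removed tags stay removed (assumption).
        if tag and tag not in self._removed_tags:
            self._tags.add(tag)

    def remove_tag(self, tag: str) -> None:
        tag = self._normalise(tag)
        self._tags.discard(tag)
        self._removed_tags.add(tag)


tags = TagCollection(synonyms={"invoices": "invoicing"})
tags.add_tag("Invoicing")
tags.add_tag("invoices")   # synonym of an existing tag, so no duplicate
tags.add_tag("urgent")
tags.remove_tag("urgent")
tags.add_tag("Urgent")     # ignored: already removed
result = tags.tags
```

Keeping this logic inside the collection means individual taggers do not need to know about synonyms or about each other.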

Tag generation goes through the IStoryTagger interface, which lets the tag-generation logic be split into separate pieces. For example, KeywordTagger adds tags based on the presence of keywords, while FeatureTagger exploits the structured information from Pepper about which feature the card was assigned to. A future tagger could call out to an LLM to categorize the card.
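
As a rough illustration of how independent taggers could share one interface, here is a Python sketch. It is not the project's code: the keyword-to-tag mapping, the `Feature` field on the work card, the `feature:` tag prefix, the `SimpleTags` stand-in for the tag collection and the `generate_tags` helper are all hypothetical, and the async methods from the diagram are made synchronous to keep the example short.

```python
from __future__ import annotations

from typing import Protocol, Sequence


class SimpleTags:
    """Tiny stand-in for the tag collection, for this sketch only."""

    def __init__(self) -> None:
        self.values: set[str] = set()

    def add_tag(self, tag: str) -> None:
        self.values.add(tag)


class StoryTagger(Protocol):
    def add_tags(self, tags: SimpleTags, work_card: dict, content: str) -> None: ...


class KeywordTagger:
    """Adds a tag whenever its keyword appears in the content."""

    def __init__(self, keywords: dict[str, str]) -> None:
        self._keywords = keywords  # keyword -> tag (hypothetical mapping)

    def add_tags(self, tags: SimpleTags, work_card: dict, content: str) -> None:
        lowered = content.lower()
        for keyword, tag in self._keywords.items():
            if keyword in lowered:
                tags.add_tag(tag)


class FeatureTagger:
    """Uses structured data on the card rather than its text.
    The 'Feature' field name is an assumption."""

    def add_tags(self, tags: SimpleTags, work_card: dict, content: str) -> None:
        feature = work_card.get("Feature")
        if feature:
            tags.add_tag(f"feature:{feature.lower()}")


def generate_tags(taggers: Sequence[StoryTagger], work_card: dict,
                  content: str) -> list[str]:
    """Run every tagger over the same collection and return sorted tags."""
    tags = SimpleTags()
    for tagger in taggers:
        tagger.add_tags(tags, work_card, content)
    return sorted(tags.values)


card = {"Subject": "Export fails", "Feature": "Reports"}
taggers = [
    KeywordTagger({"export": "export", "times out": "performance"}),
    FeatureTagger(),
]
generated = generate_tags(taggers, card, "The CSV export times out")
```

An LLM-backed tagger would slot into the same list as another object with an `add_tags` method, without any change to `generate_tags`.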