TheToolsmiths · igorrafael · Apr 15, 2024
diff --git a/..._issues_in_tools/2022/technical-issues-in-tools-development-roundtable-day-1.md b/..._issues_in_tools/2022/technical-issues-in-tools-development-roundtable-day-1.md
@@ -0,0 +1,30 @@
+---
+title: 'Technical Issues in Tools Development Roundtable - Day 1: Workflows'
+layout: codex_notes_page
+author: Igor de Sousa
+author_url: https://www.linkedin.com/in/igorrafaeldesousa/
+permalink: /codex/gdc/roundtable/technical_issues_in_tools/2022/day-1
+nav_tag: gdc
+---
+{% include JB/setup %}
+
+<br>
+
+# @ GDC 2022
+
+
+## Downtime
+
+- Scheduled: Personal Choices
+- Unscheduled: Low Hanging Fruits
+
+
+## Planning
+
+- Changes
+    - Prescriptive: harder to get buy-in
+    - Informed
+- Analysis
+    - Data+insight: allow informed decisions
+- Iterations
+    - Give users time to trial and provide feedback
diff --git a/..._issues_in_tools/2022/technical-issues-in-tools-development-roundtable-day-2.md b/..._issues_in_tools/2022/technical-issues-in-tools-development-roundtable-day-2.md
@@ -0,0 +1,23 @@
+---
+title: 'Technical Issues in Tools Development Roundtable - Day 2: Assets'
+layout: codex_notes_page
+author: Igor de Sousa
+author_url: https://www.linkedin.com/in/igorrafaeldesousa/
+permalink: /codex/gdc/roundtable/technical_issues_in_tools/2022/day-2
+nav_tag: gdc
+---
+{% include JB/setup %}
+
+<br>
+
+# @ GDC 2022
+
+- Lightmaps
+- UVs
+- Queryable databases
+- Automation
+    - Cached
+        - Parameters in Metadata
+    - Submitted
+        - Variations in a .zip
+            - Picked by game code
diff --git a/..._issues_in_tools/2022/technical-issues-in-tools-development-roundtable-day-3.md b/..._issues_in_tools/2022/technical-issues-in-tools-development-roundtable-day-3.md
@@ -0,0 +1,20 @@
+---
+title: 'Technical Issues in Tools Development Roundtable - Day 3: Pipelines'
+layout: codex_notes_page
+author: Igor de Sousa
+author_url: https://www.linkedin.com/in/igorrafaeldesousa/
+permalink: /codex/gdc/roundtable/technical_issues_in_tools/2022/day-3
+nav_tag: gdc
+---
+{% include JB/setup %}
+
+<br>
+
+# @ GDC 2022
+
+- Environment setup
+    - Remote access to managed machine (cloud)
+- Embedded creative
+    - Seal the relationship
+- Well-defined process
+- Look for GDC talk about build machines
diff --git a/codex/gdc/roundtable/technical_issues_in_tools/2024/Fr_-_Tools_Roundtable_Build.md b/codex/gdc/roundtable/technical_issues_in_tools/2024/Fr_-_Tools_Roundtable_Build.md
@@ -0,0 +1,62 @@
+Friday - Tools Roundtable Build
+
+Build automation stack
+* [Large studio] Historically we had all our code in Perforce and relied on Jenkins, and it was a pain to rely on consultants. We switched over to Git and GitLab CI, which was successful because we now had power and control over the CI. Perforce is very good for binaries, but not for branch-based development. With smaller tools we have multiple feature branches, and the user experience with Git was much better for us.
+* [Large studio] It depended on the studio and game; for games it was TeamCity or Jenkins. We'd seen some teams use Azure for the pipeline. A lot of our tooling is Azure DevOps.
+* [Large studio] It was everything custom. Flexible, but quite a bit to maintain.
+* There is a GDC talk on automated testing for Call of Duty, using Python. They used a decorator for Python functions, which allowed them to customize different dependencies for each build configuration. GitHub and Perforce don't allow that kind of customization.
+* [?] We moved all our tools from Perforce to GitLab and it was great. We use C# for VSI called Cake, which minimized the YAML stuff, and it was great. I really liked that tech stack.
+* Quick poll of the room on what was used: GitLab, GitHub Enterprise, a lot of people using Jenkins, a handful on TeamCity.
+** Jenkins: cheap, but you get what you pay for.
+
+Monolithic deployment vs modular approach
+* [?] We moved to a modular approach to the build system, but this introduced problems with managing dependencies between things. What have your solutions been?
+* [Large studio] Some teams had stuff in a folder in a repo, but that was problematic. We used a central repo for everything and published all packages to GitHub or Perforce with an XML file that ties a tool version to an engine version.
+* [Large studio] If you can manage the source in a way that makes sense, use that also for a tool executable. We've had a tool that pays attention to all the tools/engines on the drive and tries to install the correct tools for that on a developer's machine.
+
+Build times, acceleration, distributed build systems (Incredibuild, FASTBuild, SMDBS, ...)
+* Our mobile builds take 1 to 1.5 hours, do you have tips for improving this?
+* [] Some studios deploy with Incredibuild or FASTBuild. We are looking for a centralized FASTBuild service in the cloud for people to use. I know Gears of War is using this? There is no silver bullet. We get very data-driven about build times. Right now, there is a coalition of success with FASTBuild; Incredibuild costs money, FASTBuild doesn't.
+* [] With our Unity games, it took 40+ minutes to build for Android and other mobile platforms. Cloud has been expensive and slow. A cache server has definitely helped; build time is now down to ~15 minutes.
+* Understand and use the incremental state as much as possible. There was a 2022 GDC talk about using local state to improve performance. Incremental build has a lot of edge cases and requires management.
+* For [large studio], someone spent a day setting up metrics which were displayed on Grafana. We could see the tall pole (visualization) for finding the bottlenecks.
+* [Large studio] We used SMDBS both for our code and art/assets because it's free. It's all local and working sufficiently for us. ["Are you on a LAN/remote?"] We have both. We have different configurations for remote workers where they build more stuff locally. This is still in SMDBS. ["SMDBS's throughput is really amazing."]
+
+Stage environments that aren't production
+* [Large studio] - Do you typically deploy to a test or staging environment before a final publish? Do you replicate that across all backend live-ops services? Seems like a huge undertaking, but is it useful to you?
+* [Large studio] - Our web services are not as critical, so we are less concerned. It has not been critical on our side.
+* [Large studio] - We don't have any staging. When I was on the Minecraft team, we had a staging environment using a Python tool called "Locus", and it worked out fantastic.
+* [Large studio] - When, how, and why do this? If it's a tool for 5 people, it's not worth the effort. With tools for thousands of users worldwide, outages should not happen, so it was important to have zero-outage mechanisms. You have to judge the effort against the value of that. At Microsoft, we were using Azure DevOps.
+* [] All of our tools have dev, staging, and production. #memo use of dev/staging builds...
+* [Geoff, Call of Duty anecdote] Since we owned the scheduler of the actual build system, we would verify with a dedicated pool of resources and run everything with a minimal set that ran regularly to catch showstopper bugs. The whole thing was automated. As we got closer to shipping, while people were looking for bugs, we pivoted more into test.
+
+Device deployment and how do you manage them
+* [Large studio] There are a lot more devices in our landscape now.
+* [Large studio] Builds and tools have different strategies. For builds, we pushed signed builds to different consoles for playtests. For tools, we'd have our own packages and the teams would pull them down.
+* [] Us, too. The tools and game build are separate. We push our tools to a Nexus NuGet repository. The tools poll the repo for updates.
+* [Large studio] We'd put people's IPs in the infrastructure so that we could just build to those kits directly. PlayStation allows you to put in a web URL. When there is a .json file on a particular web server, the devkit would poll the server and immediately pull the build.
+* [Mobile] We'd have a QR code that a mobile device could take a picture of and pull the build from there. It was really helpful.
+* [Mobile] If you used a mobile device to build, the physical battery would expand over time!
+* [Large studio] We'd use IR cameras to look for overheating servers and to diagnose why...
+
+Build reports and communicating when things fail
+* [] Our DevSupport team would look and generate reports. The dashboard of the CI would have red alerts. We used Slack a lot for alerts.
+* [] When a build breaks, we'd look at Perforce and email people.
+* [] We wanted a fun way to encourage others, so we set up smart lights in the office that turned red. That worked very well.
+* [] One thing I saw -- They modified an Unreal game metadata server so that it would go to TeamCity and find the branch that broke the build. They even had user feedback on a particular build built into this. #toolfeedback
+* [] You'd get a Slack message if your change breaks the build. If you don't reply in Slack within X minutes, your change is reverted immediately.
+* [Game] We used the TV screens in the TV area to show a big green or red box based on the success of a build and 'shame' people. "Information radiators".
+* [] We added a "not me" button to say that I didn't break the build, but everyone clicked it too much, so we added a text field for "Whose fault do you think it is?"
+* [Internal engine] We had a tool that checks for a broken build and checks with you. It also had a "not me" button. People from Support would follow up with you if you broke the build.
+* [Security engineer, independent] We had a bunch of red and green lights and integrations with Slack and email. The manager would also get notified if it happened too much.
+* [Large studio] Frostbite's tool mentioned above is very scary for me to use.
+* [] How about presubmit validation? That would help. Any further stories?
+* [Internal engine] We do have a check before submit that takes your submit through a lot of CI tests, but it is more focused on the C++ side rather than the tooling side.
+
+Performance measurements and reports
+* [] How do you performance test? You have to simulate devices, users, etc... What tech stack, etc.?
+* [] Does anyone else perf-test their tooling? Simple REST services are easy to do, but how about large-file issues like crash dumps?
+* [] For internal tools, one thing we are doing is putting in a lot of telemetry. Then, we'll get the user's bandwidth, time to download, etc...
+* [] Our focus on telemetry is on the tool against a particular SDK or service. Especially when there is an interaction with another service, that's where we often see failures.
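
Editor's sketch (not from the session): several device-deployment comments above describe a client, either a tool or a devkit, polling a server for a small file and pulling a newer build when one is advertised. A minimal illustration of the version check is below. The manifest fields `version` and `url`, the example URL, and both function names are hypothetical; nothing here reflects how PlayStation devkits or Nexus actually behave.

```python
import json


def parse_version(text):
    """Turn a dotted version such as '1.10.0' into (1, 10, 0) for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def check_for_update(installed_version, manifest_text):
    """Return the download URL if the manifest lists a newer build, else None."""
    manifest = json.loads(manifest_text)
    if parse_version(manifest["version"]) > parse_version(installed_version):
        return manifest["url"]
    return None


# Hypothetical manifest body; a real client would fetch this text over HTTP
# on a timer instead of hard-coding it.
MANIFEST_TEXT = '{"version": "1.10.0", "url": "https://builds.example.invalid/tool-1.10.0.zip"}'

update_url = check_for_update("1.9.3", MANIFEST_TEXT)   # newer build available
no_update = check_for_update("1.10.0", MANIFEST_TEXT)   # already current
print(update_url)
print(no_update)
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.10.0" sorts before "1.9.3", which would hide the update.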
+
+
diff --git a/...x/gdc/roundtable/technical_issues_in_tools/2024/Th_-_Tools_roundtable_Assets.md b/...x/gdc/roundtable/technical_issues_in_tools/2024/Th_-_Tools_roundtable_Assets.md
@@ -0,0 +1,61 @@
+Th - Tools roundtable: Assets
+
+Baking assets - where, when, tips?
+* Do you allow users to submit partially baked assets?
+* Constantly monitored incremental builds. Cache which is constantly updated. Everyone is pushing to a cache that keeps the build metadata.
+* Wwise has constantly rebuilt sound banks. "I don't trust my designers to build everything", so there is a build job always running, churning away. Yet, they can build locally themselves if they need to.
+* At [Large studio], we had a heavy reliance on determinism -- so everyone who pushed could cause issues. Our CI system ran binaries that would conform to one -- and all assets went through CI.
+* You might be choosing a certain type of baking because it's better for your general users.
+
+Assets - build or buy?
+* High visual studios: We want to previsualize what our characters would look like. I'm hoping for libraries or solutions that would compose them across all of our tools. 
+* You could buy, then build. You can buy the common assets, but you'll want to customize some.
+* [AAA studio] We have basic solutions. Asset Tracker tracks Perforce check-ins and links to #Jira and provides screenshots. #memo Jira is used by many teams and is a legitimate output for interoperability
+* In [Large studio], we build kits with big showrooms for everything that you can copy and paste things from another editor. It's intuitive and you don't need another UI or tool.
+* "Zoos" and "Gyms" in studios that have all the assets and everything with effects running on it. When I worked on Hitman, we had a whole level with a bunch of stuff and easy to reference -- you don't have to look for a name, just see and grab.
+* Students forced to make their own assets created simple assets, but there was greater creativity.
+
+* A significant number of people using AI to tag metadata for assets. 3 or 4 people have actually done this.
+* At ___ we call this Asset Index. 
+
+Storage - single source of truth
+* [Large studio]: Has anyone used something other than Perforce or homegrown?
+* [Large studio]: Our animators decided to just use a Google Drive. That was not recommended.
+* For a while it was an AWS bucket with folders with whatever names. Not recommended.
+
+Game vs Cinematic assets
+* We do games fast, like half a year. We outsource all the cinematics. So, they need to get the latest models/items, etc. How do we make sure they have the correct version of everything in those shots?
+* We try to separate the Art stream and the Design stream.
+* [Large studio] we had a [Dev only] flag to make sure developer art was filtered out of a gold release.
+* Taking regular screenshots and diffing the screenshots, you can use it to catch changes that might have slipped through the cracks.
+
+Automated asset validation/verification tools
+* Making sure names don't have typos, etc.. without a manual reviewer or making sure combinations are compatible with each other.
+* There was a tool out there that tried to take screenshots.
+* We had validations in Unreal, Blueprints; can't submit without the right conventions. Using drop-downs to validate. And empowering people to create their own validations seems important since requirements might change.
+* Gear might be required to be valid because, as armor, it has to x, y, or z. This is hard to check, but also custom.
+* [Large studio]: With asset validation, tools developers won't know all the disciplines. You want to empower those developers to do it themselves. Throw it in their face for them to correct it at the time of submission rather than later when it has been baked.
+* Clarification: How often do people not trust others to do it... plus, does it work with every animation that it needs to work with?
+
+Asset unique ids vs pet names
+* We generate massive worlds. We don't want to pull a Perforce repo because it's 300+ GB.
+* Unreal has a way to cook a partition of the game to generate a minimal part of the build for you.
+* Or, you could build a base image and have Perforce do a sync for the changes you care about.
+
+Remote work and pipelines
+* Bandwidth - Parsec is in use, sure. - Can you get your tools actually local?
+* If you do have assets locally -- we've been investigating file shares, etc...
+* Spider-Man 2: We are all working remotely. We converted everyone to P4 VFS with stub files, which makes their working set a tiny subset of what is needed to build the entire game. We are using the checksums from Perforce as cache IDs to get to the cache, which avoids using the source tree and minimizes the transfers. #Perforce stubs it in in time.
+** How much work did you have to do to tell your files. #usecase #remotework #P4
+** [Large studio]: we had a virtual filesystem talk about this
+** Hoardstorage and GDC cache (sp?) are tools experimented with
+* Guerrilla gave a talk on this last year
+* I tried VFS (Perforce); it didn't seem to be enough for us.
+* Another alternative, the CoD engine, we brute forced the metadata so that you can use that (to dip into a 'cache') to determine if you needed to pull and rebuild/reload. Using caching solutions to solve massive amounts of data.
+
+Seems like naming is an obvious pain point. What else is an artist not aware of?
+- They are columns, not pillars
+- Spell checker was #1 
+- Artists aren't reading the text and not following the process
+- Talk to the developer of the tool. Tell us how we can improve the tool for you
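
Editor's sketch (not from the session): the remote-work comments above mention using file checksums as cache IDs so that baked results can be fetched without touching the source tree. One way to picture that idea is a content-addressed cache where the key is a digest of the source bytes. In this sketch `hashlib.sha256` stands in for whatever checksum the version control system supplies, the in-memory dict stands in for a shared cache server, and `BakeCache` and `fake_bake` are hypothetical names.

```python
import hashlib


class BakeCache:
    """Content-addressed cache: identical source bytes are baked only once."""

    def __init__(self):
        self._store = {}      # digest -> baked result (stand-in for a cache server)
        self.bake_count = 0   # how many real bakes were needed

    def get_or_bake(self, source_bytes, bake):
        # The digest of the source content is the cache ID.
        key = hashlib.sha256(source_bytes).hexdigest()
        if key not in self._store:
            self._store[key] = bake(source_bytes)
            self.bake_count += 1
        return self._store[key]


def fake_bake(source_bytes):
    """Placeholder for an expensive bake step."""
    return source_bytes.upper()


cache = BakeCache()
first = cache.get_or_bake(b"rock_mesh_v1", fake_bake)   # miss: bakes
second = cache.get_or_bake(b"rock_mesh_v1", fake_bake)  # hit: reuses result
third = cache.get_or_bake(b"rock_mesh_v2", fake_bake)   # changed content: bakes
print(cache.bake_count)
```

Because the key depends only on content, two users with the same file revision share one cached result regardless of file names or local paths.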
+
diff --git a/...undtable/technical_issues_in_tools/2024/Wed_-_Tools_Roundtable_-_Engineering.md b/...undtable/technical_issues_in_tools/2024/Wed_-_Tools_Roundtable_-_Engineering.md
@@ -0,0 +1,68 @@
+Tools Roundtable - Engineering
+
+Debug issues for games on Proton.
+- We have used logging and RenderDoc, but have been running into issues.
+- Heard that you can run Visual Studio remote debugger on Proton
+
+We only have 2 engineers... shortcuts?
+- We're making WPF tools and editor work for Unreal. Are there resources/libraries?
+- [Large studio]: I feel your pain. We have tools with 4000 active users supported by 2 people. Sometimes, we just couldn't support a tool. The solution was to use a plugin architecture with a usable/extensible API, but not to take on the risk of integrating plugins back into the tools. Lots of upfront work and community management, but useful.
+- - Extensibility: an abstraction layer.
+- - How do you design that abstraction layer? -- EA: What we did was have a middleware pipeline and plugins interact with it before and after. Lots of isolation and focus and wrapping and exception handling: in order to show that "it's not our fault".
+- [Large studio]: We basically tell the game code to do whatever they want. If another game team wants it, then we talk about integration.
+- Remember not to build things that are not going to be used.
+- A lot of times you are defined by what you don't do as much as what you do do. When you say no, it makes it real for them to understand the impact when you don't do that.
+- [Large studio]: Why doesn't open source help get me kickstarted?
+
+MS Blazor instead of XAML - (Ubisoft Montreal)
+- C# web development is easy to move to from WPF development.
+- [Large studio]: You can go Blazor or React.
+- [Large studio]: What's the current state of Blazor desktop apps?
+- Easier for designers to test and easier for cloud deployment.
+- Tally from the room: 3 or 4 Blazor, 7 XAML, 1 WinForms, 12 Qt, 1 front-end browser.
+
+Tools distribution - How do users get the stuff?, Versioning
+- How do you actually install stuff and get stuff on machines?
+- There are still random things: Chocolatey scripts, untrusted exes
+- Perforce
+- [Large studio]: We use NuGet, and create a "fake executable" that you control the version with.
+- [Large studio]: At Volition, we had an entire install system that pulled down the initial setup, pulled everything from Perforce, looked at the registry, then, ran the installer. Set up so that it was basically turn key.
+- Cloud-based, mount a network drive and have them use that.
+- Use REZ. It is a package management system used to distribute tools and Python source code. REZ does versioning very well.
+- We abuse Unreal Game Sync and use that + launcher for our tools there.
+- Yarn for web was the standard -- surprised that there isn't an equivalent in game development
+- Versioning:
+-- Casino games are smaller, with high velocity: 20 games in a single year. We have branching issues and complexity.
+-- If you have a monorepo, that's a different problem than if you are submodule-based
+-- What you decide to build as a feature flag is another aspect of this.
+-- NuGet packages have a version scheme; this helps distinguish between early adopters and regular users.
+-- [Large studio] has its own proprietary package manager and application launcher. All versioned. Each team manages an XML file to do these things. This application is the first-time install.
+
+Authoring tools supporting multiple platforms on the tools. 
+
+API standards and interoperability
+- So many companies just take what you've got as long as you made the game to the quality you want
+
+Automated test for gameplay
+
+Default settings for your tools
+
+How do you get devs to write patch notes and docs? And Release notes.
+- [Large studio]: We tried to move from Confluence to Notion, it didn't really help.
+- We have one person at the end of every work period collate everything.
+-- [Large studio]: It's not allowed to update without release notes. The CI system does this check.
+
+
+Reproducing bugs in tools.
+- [Large studio] ShadowPlay is constantly recording and useful for the last 5 minutes of an issue.
+- Crash dumps
+- [Large studio] - in the past - We would log all user actions into a journal so that when there was a crash/misbehavior, you could get a breadcrumb of information to reproduce it
+- At [Large studio], with cloud developer, we need observability - a set of observability principles that involve logging, telemetry, and metrics to get more information to diagnose when something is going wrong.
+- 
+
+Inner source vs Open source. (though Perforce sucks at that)
+
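Editor's sketch (not from the session): the bug-reproduction comments above describe logging user actions into a journal so that a crash report carries a breadcrumb trail. A bounded journal can be sketched with a fixed-size ring buffer, so memory stays constant however long the tool runs. `ActionJournal`, its method names, and the sample actions are hypothetical.

```python
from collections import deque


class ActionJournal:
    """Keeps only the most recent user actions, oldest first."""

    def __init__(self, capacity=100):
        # A deque with maxlen drops the oldest entry automatically once full.
        self._entries = deque(maxlen=capacity)

    def record(self, action, detail=""):
        self._entries.append((action, detail))

    def breadcrumbs(self):
        """Return the retained actions as a list, e.g. to attach to a crash report."""
        return list(self._entries)


journal = ActionJournal(capacity=3)
journal.record("open_level", "forest_01")
journal.record("select", "rock_mesh")
journal.record("move", "x=+2")
journal.record("save")

# Only the last three actions survive; "open_level" has been dropped.
trail = journal.breadcrumbs()
print(trail)
```

In a real tool the crash handler would serialize `breadcrumbs()` alongside the crash dump; the capacity trades context against report size.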
+