Hello,
I know there are a lot of projects available that use an SD card or emulate a disk drive.
I am trying to make a new one available to the community, stable and cheap.
I have started a new project to emulate the SDISK II using an STM32 Blue Pill and an SD card.
Once it is fully working I will add a 0.96" screen and 3 buttons and make a dedicated PCB, everything for less than $10.
I am almost done with the prototype, but I need to debug a bit what the Apple II is getting from the stm32.
Is there software that can easily dump the content of a floppy drive track or sector to the screen?
I found this article, Apple Floppy Dumper, but not the program that comes with it.
Thanks
Vincent
For DOS you could try Bag of Tricks. Or with ProDOS use Blockwarden.
Nice, thanks, I will give it a try.
In post #1, 'VIBR' wrote:
" I am almost done with the prototype, but I need to debug a bit what the Apple II is getting from the stm32. "
Uncle Bernie comments:
The 'acid test' for any Apple II floppy emu are the WOZ files (more specifically, the WOZ 2+ types). This is because WOZ files demand faithful reproduction of relatively complicated copy protection formats, which of course will, at times, violate DISK II GCR and formatting rules to create the desired effect. For these, proper timing of all the RDDATA pulses is critical.
The task is complicated by the fact that some games do not tell you if 'higher stage' copy protection checks have failed. Instead, they subtly change the gameplay parameters such that the player can't win. For instance, in "Prince of Persia", a copy protection fail will turn the content of a vial the avatar must drink to win into poison. Typically, these staged copy protections will do a simpler check while loading and will fail the load when something is grossly off or the copy protection is missing completely (naive copy without proper tools). But the higher stages will check the copy protection for more subtle details and won't tell you anymore if the check fails.
This said, a thorough test of your emulator will be extremely time consuming and tax your game playing skills...
So the solution I suggest is to work without your hardware first. Instead, build a faithful, cycle accurate "C" model of your emulation code which you then plug into one of the open source Apple II emulators which support 'WOZ' files. Your "C" model will "look" at the control signals of the 74LS259 addressable latch of the DISK II emulation code of the emulator and it will have its own access to the WOZ file. Implement a compare of the RDDATA streams of your "C" model and of the emulator. All you need to do then is to find WOZ files on the web and load them into that emulator. Any difference in the RDDATA streams (after being synchronized with the Q3 clock) means your emulation of the floppy disk read process is not yet perfect. The emulation of the floppy disk write process is easier because there are fewer timing ambiguities involved (vs. the read process). You can do a digital PLL locking to the changes of the WRDATA stream to get the individual bit cells.
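The stream compare described above could be sketched as follows. This is only a minimal sketch: it assumes that both the "C" model and the reference emulator log one RDDATA sample per Q3 clock into a byte array, and the function name and log format are hypothetical, not taken from any emulator.

```c
#include <stddef.h>
#include <stdint.h>

/* Compare two RDDATA logs sampled once per Q3 clock (one sample per byte,
 * zero = low, non-zero = high). Returns the index of the first sample that
 * differs, or -1 if the first n samples agree.
 * Hypothetical helper; the log format is an assumption. */
long first_rddata_mismatch(const uint8_t *model, const uint8_t *reference,
                           size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if ((model[i] != 0) != (reference[i] != 0)) {
            return (long)i;
        }
    }
    return -1;
}
```

Reporting the sample index rather than a plain pass/fail makes it possible to map the mismatch back to a position on the track.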
For which games to try, look at the various threads concerning the development of MMU substitutes.
I use similar techniques for all my reverse engineering and MMU, IOU and IWM substitutes development work. For instance, at the moment I have "C" models - which are almost "RTL" - running in lockstep with a real IWM interfaced to the printer port of a notebook computer. Some 10 million simulated clock cycles later (which have about 20 minutes runtime) the software tells me about mismatches, timing discrepancies, and produces charts telling me which transitions were taken in all the state machines involved (actually, there are automatic checks to report transitions never taken, which means the input vectors are not exercising all functionalities).
I'm not tempted to go to a hardware implementation before I have full confidence that the models match the real IWM. And with your work you are in the same situation. Unless you have a 100% match of a software model of your floppy disk emulator with what others have implemented in Apple II emulators it is just a waste of time to go to experiments with real hardware. You will not get anywhere in any useful amount of time. But if you go the path I have shown to you, you can leverage many, many man-years of work from very skilled and experienced developers who implemented the WOZ file based floppy disk emulation in these open source Apple II emulators. All this is NOT trivial and any small mistake and oversight you might have in your STM32 code may cause a lot of trouble (and loss of your time) down the road. But if you can prove that your model matches the proven work of the "experts" in Apple II copy protection emulation, then you can achieve a truly stellar product with your floppy emulator.
- Uncle Bernie
Thanks Uncle Bernie for your very detailed post. Maybe it deserves more information on the project I am working on:
- It leverages the Apple II Disk controller card
- It will use the NIC file format (6-and-2 encoding from a DSK file format).
- From a hardware standpoint:
  - I2C 0.96" mini OLED screen
  - 3 buttons to select the disk file
  - STM32 F103 (72 MHz CPU, 20 KB RAM)
  - SD card over SPI
I have read the very useful book Beneath Apple DOS as a source of inspiration.
The challenge is to get the Apple II to load the DSK file just as it would from a Disk II.
From a software standpoint, I use:
- a circular DMA double buffer to SPI on the STM32 with a 1 us pulse width and a 3 us delay (to have a 4 us cycle between 2 data pulses),
- 4 GPIO interrupts for the head move signals Step0, Step1, Step2, Step3.
It is almost working: I am able to load the DSK, I start to see the game (Boulder Dash II) pre-screen on the Apple II, and then it crashes.
I suspect the double buffering is doing something wrong when a head move is happening:
- I prepare in advance the next file sector to be sent,
- When a head move signal is received, I finish sending the next sector and I prepare the new track / disk sector.
I assume the disk controller is expecting the next disk / sector straight away.
I will try to find a way to stop the SPI DMA and prepare the new buffer to be sent.
Vincent
The disk controller does not care about tracks and sectors; these are all (except for the boot sector) detected under software control. When the head moves to another track, the software waits (if no protection is involved) a decent amount of time, which should include the head movement time to the new track and the mechanical stabilization time at this new track, and then starts to pay attention to the data stream that comes from the floppy drive. During the head movement the data received from the drive is garbage/noise. Head movement is much slower than one sector time, especially if the head carriage has to travel between tracks 0 and 35. Look for your problem elsewhere.
Thanks Retro_devices,
to be more specific, the way I implemented it on the STM32 is:
the buffer is filled with the content of the sector and DMA is sending the data buffer to the SPI GPIO (read pulse every 4 us).
While sending, it is preparing the next sector to be sent. It means that while the head moves, the program will continue to finish the current sector and the next one before moving to the next track, and this is where I might have an issue. Otherwise, I use a logic analyzer and the timing is perfect, as is the data comparison.
My question is: what is the typical timing between 2 sectors on the same track, and what is the timing of the data stream across a change of track?
This is the output of the logic analyzer, very clean:
[Image: logic analyser output]
Same with more details:
[Image: detailed view]
You should place a whole track (with necessary preprocessing) in the DMA circular buffer. You do not have to care about sectors.
I tried, but a track is 16 sectors, which means
16*512*4 (4 us) => 32 KB, and I only have 20 KB of RAM... I need to split it
Vincent
If you split it, your emulator is doomed to be buggy and faulty. Try to find another solution. If you want to stick with the same STM, I would choose a DMA/SPI clock 4 times slower (and hence a buffer 4 times smaller), and on the positive edge of its output you could form roughly 1 us pulses with external circuitry (monovibrator, RC + rectifier + TTL buffer, etc.). Also read about the WOZ format and try to implement it in the first place.
In post #7, 'VIBR' wrote:
" While sending it is preparing the next sector to be sent. It means that while head move, the program will continue to finish the current sector and the next one before moving to the next track and this is where I might have an issue. "
Uncle Bernie comments:
This approach won't work. Many Apple II copy protections move the head while continuing to read. One example is "spiral tracks". Your floppy emu must be able to handle this properly. To do this it must know if it is a 'spiral track' or a 'fat track' protection. AFAIK, only a WOZ file has all the relevant information inside. But I never looked at DSK file contents ... DSK is obsolete since WOZ came out.
In no case should you need a 20 kByte circular buffer. You don't need 4us spaced pulses with RAM wasting "empty" space in between, because the DISK II controller only looks at falling transitions of RDDATA which mean a "1" bit in the DSK or WOZ files. You could use a few gates on SPI's data and clock to make these falling transitions and then extend them in duration / pulse width as required. Look at the DISK II controller card schematic, it's the inverted (NAND gated) 2 MHz Q3 clock which clocks two D-FFs which synchronize the RDDATA and another NAND detects the falling transitions. As long as you get the timing right you could just turn each "1" of the SPI into a "101" where the "0" is the negative pulse which is wider than a Q3 clock period, so the DFF can "see" the "0".
In this scheme you don't need to expand the few (< 6000) disk bytes per track into larger buffer contents. You could use them directly and so you could keep three adjacent tracks in the STM32 RAM, to support all the different "moving head while reading" copy protections. You need to keep these buffers updated BEFORE the stepper motor phases change...
There is one issue left ... unstable bits. WOZ files support that. What this means is that for each revolution of the simulated floppy disk, you need to toggle specific bits in the buffer to produce the desired effect. But this is trivial to do.
I never said it would be easy. But I think you can do it. Using the STM32 seems to be a smart choice. Apple II copy protection emulation is much simpler than for any other floppy disk system, because there is no real PLL in the DISK II. We can be grateful to Woz to have simplified the concept to the bare bones. If he had used a real PLL (like the Western Digital floppy disk controllers of the 1980s do) there would be a whole Universe of very tricky, very difficult to emulate copy protection such as frequency density modulation, PLL pull-off, etc., all techniques which to me look eerily similar to electronic warfare techniques like "range gate pull off" or "velocity gate pull off" in typical late 1950s and early 1960s ECM systems which were meant to protect military airplanes against being shot down by SA-2 missiles over Vietnam or other hotspots. (We can really think that the "copyprotection wars" of the 1980s were a sort of electronic warfare following the pattern ECM, ECCM, ECCCM ... except the casualties were not people but companies whose copy protections failed to protect their software against "software pirates" ... but the truth was that these failed companies were just too cheap to buy the best protections available. I myself had developed one which was undefeatable without immense hardware investment but no takers ...)
- Uncle Bernie
P.S.: Recently, I made a thread here on Applefritter about the "tome" of Apple II copy protections:
https://www.applefritter.com/content/found-great-book-apple-ii-copyprotections-explained
It might help you with your mission ...
Thanks for the detailed explanation. I better understand the gap between what I have done and what is left to be done, and this is a real mission.
I think I can divide the DMA SPI buffer by 4 using a DFF edge detector, maybe with a 74LS175 and a NAND gate coupled with a 74LS02 with RC.
I have read the technical spec of the WOZ file format very carefully and there are a few things that are not so clear.
I am currently using the NIC file format, which is the representation of the 6-and-2 GCR encoding. I read the SD card cluster and get a buffer that is almost what I need to send in the circular buffer.
Question 1: using WOZ, do I need to encode to GCR 6-and-2?
Question 2: keeping 3 adjacent tracks in parallel (the selected one + the one before and the one after), and assuming that the GCR buffer is 408 bytes per sector, I need 3 x 408 x 16 bytes, which gives 19584.
Vincent
I do not understand what the flux encoding provides compared to the nibble format.
In post #13, 'VIBR' wrote:
" Question 1 : using WOZ do I need to encode to GCR 6x2 ?"
" Question 2 : Having in // 3 adjacent track (The selected one + the one before and the one after), assuming that the GCR buffer is 408 Bytes, I need 3 x 408* 16 which gives 19584. "
Uncle Bernie comments:
You actually do not need to encode or decode anything. Just treat the whole track data from the WOZ file as a sequence of bits, one bit per bit cell (of 4 us) and move that bit stream out by SPI at a rate of 4us per bit. You need to do something to "wrap around" the DMA to the beginning of the track buffer once a complete revolution of the simulated floppy disk has happened. There must be no gap / pause in the bit stream. I don't know how the STM32 does DMA and SPI but I did use the ST62 family many decades ago. Which had no DMA, but a SPI peripheral which would put the SDO and SCK on two pins. I think with just one TTL package you could do some logic which makes a brief, say 1.2us to 2us wide (not critical as long as > 1us) '1' pulse on RDDATA for each '1' being shifted out by the SPI peripheral. You could use one 74xx121, 74xx122 or 74xx123 oneshot to do that. All you really need to do is to turn each '1' seen in the SDO to a short negative going pulse on RDDATA, and because a string of 1's i.e. '111' needs to make three pulses, you need the SPI clock information on a pin to do that. But internally of the STM32 you just would need one byte of RAM per disk 'nybble'. Which contains GCR but you better ignore this fact to avoid confusion and extra work ;-)
In other words, do not try to decipher anything in the bit stream. Just load it into your track buffers and pump it out by SPI, and it will work.
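To make the "one pulse per '1' bit" idea concrete, here is a small host-side sketch (not firmware) that walks a packed track bitstream and lists when each RDDATA pulse would start. It assumes the bits are packed most significant bit first and a nominal 4 us bit cell; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define BIT_CELL_NS 4000u /* nominal bit cell of 4 us, as discussed above */

/* Walk a packed track bitstream (assumed MSB first) and record the start
 * time, in ns from the start of the buffer, of the RDDATA pulse that each
 * '1' bit would produce. '0' bits produce no pulse at all.
 * Returns the number of pulse times written to pulse_ns. */
size_t rddata_pulse_times(const uint8_t *track, size_t bit_count,
                          uint32_t *pulse_ns, size_t max_pulses)
{
    size_t n = 0;
    for (size_t bit = 0; bit < bit_count && n < max_pulses; bit++) {
        uint8_t byte = track[bit / 8];
        if (byte & (0x80u >> (bit % 8))) {
            pulse_ns[n++] = (uint32_t)(bit * BIT_CELL_NS);
        }
    }
    return n;
}
```

The buffer itself stays at one bit per bit cell; the pulse width is left to the external one-shot, so no RAM is spent on the "empty" time between pulses.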
If you don't follow that advice and try to work on a GCR level then it will get needlessly complicated.
This also preempts your question Q2. While the RWTS code running on the Apple II needs more extra RAM space than 256 bytes per sector, to be able to put the GCR together and translate it back to unencoded bytes, all this is completely irrelevant for a floppy emu 'track level' functionality. Because if you do it right, you NEVER try to understand GCR and NEVER try to encode/decode the bit streams you are dealing with. Because this is formatting which is better dealt with on a RWTS level, meaning in the Apple II itself. It won't work with most copyprotection methods anyways because these do not follow Apple's formatting rules (otherwise DOS 3.3 could copy such 'copyprotected' disk).
What a floppy emu must do is to just stupidly provide the bit cell streams for each track even when the 'head' is being moved. Since even the STM32 is too slow to load a track buffer upon detection of a head move, you need to work preemptively and for each active track currently being read, preload the track buffers for the adjacent tracks - the WOZ file will tell you about the nature of these tracks (normal, spiral, fat) and whether they are normal, half or quarter track spaced. I'm not too much of an Apple II copyprotection expert as I was in some 'other' camp when the 1980s "copyprotection wars" were raging, but I'd think that any head movement over a larger distance (to a really different track with different contents) could be accomplished AFTER learning that the stepping motor phases have changed. Because this inevitably means that the floppy controller will lose sync and needs to reacquire sync once the head has settled on that new track. But for small head movements to the next quarter or half track you need to have the 'foresight' as this could produce a continuous RDDATA stream despite the head being moved (spiral track and fat track protections do that).
As for writing to the floppy emu, once the write gate signal comes, you need to grab the bit stream coming from the DISK II controller and put it into the track buffer at that point in time. You can ignore the splices (no extra code) as long as you react to the write gate being turned off quickly (use interrupt).
One issue you will inevitably see when you implement write mode is 'clock slip' meaning it won't work. This can only be cured if you phase lock your SPI clock to the WRDATA stream of the DISK II controller. I don't know if the STM32 supports this internally, but it certainly supports an external SPI clock being fed to its SPI peripheral. And all you need to do this 'PLL' is a counter which gets reset for each WRDATA transition and runs at a substantially faster clock than 2 MHz.
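The counter-based 'PLL' can be illustrated in software. The sketch below is a simplification under stated assumptions: every WRDATA transition marks a '1' bit, the counter value captured at each transition (ticks since the previous one) is available in an array, and `ticks_per_cell` is the nominal bit cell length in counter ticks (for example 32 for an 8 MHz counter and a 4 us cell). All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Turn the intervals between WRDATA transitions into bit cells.
 * Because the counter restarts at every transition, timing error cannot
 * accumulate across transitions; each interval is simply rounded to the
 * nearest whole number of bit cells. An interval of n cells is decoded
 * as (n - 1) '0' bits followed by one '1' bit.
 * Returns the number of bits written to bits[] (one bit per byte). */
size_t wrdata_decode(const uint16_t *interval_ticks, size_t n_intervals,
                     unsigned ticks_per_cell, uint8_t *bits, size_t max_bits)
{
    size_t out = 0;
    if (ticks_per_cell == 0) {
        return 0;
    }
    for (size_t i = 0; i < n_intervals; i++) {
        unsigned cells =
            (interval_ticks[i] + ticks_per_cell / 2) / ticks_per_cell;
        if (cells == 0) {
            cells = 1; /* treat a too-short interval as one cell */
        }
        for (unsigned z = 1; z < cells && out < max_bits; z++) {
            bits[out++] = 0;
        }
        if (out < max_bits) {
            bits[out++] = 1;
        }
    }
    return out;
}
```

With a 32-tick cell, intervals of 32, 63, 33 and 97 ticks decode to the bits 1, 01, 1, 001, even though none of the last three is an exact multiple of the cell length.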
I don't have the number of bytes needed for such a track buffer in my head, but it's somewhere in the WOZ file specifications. I remember that they had to increase it over their initial value because some copy protections use a slightly higher flux density than the 4us. Back in the 1980's copyprotection wars this was a show stopper for most 'copy' tools until somebody figured out how to make the RPMs of the floppy drive programmable by the copy tool. In one case I remember on the Atari, they were able to squeeze 21 sectors on a track which normally held 18 sectors. The mastering machines did that by means of their programmable clock, but the 'pirates' defeated this by slowing down their spindle speeds for such tracks. And the copyprotection wars went on and on ...
If you want to support such a protection faithfully on your floppy emu, you also must be able to change the SPI clock speed in fine increments, such that the larger track buffer fits into the same revolution time period. Note that this is a weak attempt ... I can't say if it's good enough for all such protections out there. But it's a start. Mastering machines such as the FORMASTER had programmable clocks which could change along the tracks. Which was used for copy protections involving 'density frequency modulation' techniques. Which can be used to several effects. If done gently, the software could check that different sections of the track would take different times to read but all the sectors themselves (and the format) would appear to be normal. Think of a 'rubber band' being the track, with black rings written on it for the bits. You could stretch the rubber band but the information as such would stay the same. Only the bit rate seen by the FDC would go down. And this can be measured in terms of CPU cycles. If done rudely, the 'density frequency modulation' technique could produce 'unstable' sectors inevitably having a CRC error, too (WDC controllers, not Apple DISK II) but the 'unstable' bit pattern would not be entirely random and could be analyzed to check the authenticity of the floppy disk. I don't know if this technique has ever been used in the Apple II, but I think it's possible to use that technique to insert or delete one (or more) bits in the bit stream in a defined region of the track. The actual position of the effect within the track read would vary depending on the RPMs and some jitter effects, but if designed properly, the bit stream before and after the effect (the 'unstable' bits) could contain stable information which could be utilized for the copy protection check.
Back in the 1980's copyprotection wars, we defeated this scheme by providing several incarnations of the sector in question on a track, so the copy protection check would be fooled into thinking it's genuine.
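Returning to the SPI clock adjustment mentioned above: fitting a longer track into the same revolution comes down to one division. The sketch below assumes a nominal 300 RPM drive, i.e. 200 ms per revolution, and a track length given in bits; the function name is hypothetical, and how finely a given SPI peripheral can actually approximate the result is a separate question.

```c
#include <stdint.h>

#define REVOLUTION_NS 200000000u /* assumed 300 RPM: one revolution = 200 ms */

/* Bit cell duration (rounded to the nearest ns) needed so that a track of
 * track_bits bits takes exactly one revolution. Returns 0 for an empty
 * track. */
uint32_t bit_period_ns(uint32_t track_bits)
{
    if (track_bits == 0) {
        return 0;
    }
    return (REVOLUTION_NS + track_bits / 2) / track_bits;
}
```

A 50000-bit track gives the nominal 4000 ns cell; a denser 51200-bit track needs about 3906 ns per bit.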
I could go on and on, but all I wanted to do is to give you some ideas on what can be done in a floppy emu using SPI and what can't be done. If you wanted a floppy emu which could do everything then you need to add a 'flux engine' which allows to produce arbitrary read pulse distances with a very fine resolution. These things are expensive to do, even today. The cheapest solution is my 'Ratweasel' but it needs the full processing power of a PC or notebook with a parallel printer port in ECP mode, and using DMA. The track buffer size for this exceeds the RAM of the STM32.
So my advice here is that you should not try to 'reach for the moon' and be able to do everything in terms of copy protections, you do not need a flux engine, just be good enough for most cases of WOZ files now in existence. I think that they still lack the support for these more elaborate schemes and whatever you will find in terms of WOZ files out there on the internet is all you need to be able to support with your STM32 SPI based floppy emu.
I'm really hoping your project will succeed because I need a 'cheap' floppy emu for my Replica 2e project. For my development work I use the BMOW Floppy Emu which I think is a great product for the price (I'm not affiliated to them, mind you, and paid the full price for mine) but I think it's too expensive for the Replica 2e ... my price target for the whole kit including PCB is below the price of the BMOW Floppy Emu alone. And I think it should have a floppy emu on the motherboard because otherwise builders would need to find functional 5.25" 40 track floppy drives and real floppy disks and this now costs even more than the whole BMOW Floppy Emu does. No slight against it intended ... it's a great product, and I'm fully satisfied with it so far ... but I need a cheaper solution. Just as a price socket, the "Ratweasel" could be built for less than $10 in parts. But this means you would need a notebook computer with the required parallel printer port and ECP mode done with a compatible "Super I/O" chip inside, as I don't have the time to write drivers for every possible such notebook, and furthermore, once you have a powerful notebook like that, with all the WOZ files loaded on its hard disk, why not also load an Apple II emulator to play these games ... no Replica 2e needed anymore !
So your new floppy emu is the only hope I have now. I wish you success ! This is why I write such long posts in your thread. I want you to do it right and want you to succeed !
- Uncle Bernie
Thanks UB, I will try to follow your approach and to do the right thing.
I better understand the approach of the WOZ bit cells; I need to check the size of a track in the example WOZ files provided.
If I use a circular buffer I can easily split the buffer in 2 and manage to have 3 byte buffers for the 3 adjacent tracks, and it should work. I need to do some very simple math.
OK for the 74x121/2/3, this is also what I thought (and I will use a clock timer output from the STM32 to make it happen). I will do some tests tomorrow (I do not have these chips).
I think I know how I will manage the byte buffer switch for track change.
Where I am not 100% sure is the stepper motor change algorithm.
I would like to start by fixing this algorithm.
I use GPIO port interrupts on the STM32 to compute the head move based on the Step0->3 changes.
For the moment, the GPIO interrupt is triggered on the positive edge only, which means that when moving from 1->0 I do not get the interrupt (and I start to think this is an issue).
[Image: STM32 GPIO]
I use PB8, PB9, PB10 and PB11 because they are 5 V tolerant, which is not the case for all the GPIO pins, and I need adjacent pins to ease the computation.
I have 2 SPI : one for the SD card (SPI2) and one for the RDData output (SPI1).
Device Enable is used to enable or disable the bitstream output (even if I have a 74LS125 that is doing the 3 state buffer)
For the moment I use a NIC file with the Apple II, I record the interrupts in a buffer, and I output the buffer to UART after a period so as not to create delay in the interrupt.
This is the output of the stepper motor change interrupt:
Stp is the GPIO step bit configuration (on GPIOB I do a right bit shift of 8 : stp = (GPIOB->IDR >> 8) & 0b0000000000001111;)
newStep is the XOR of stp and prevStep
prevStep has the newStep value from the previous cycle
Ph_track is the physical track with a value from 0 to 70 (I have the feeling that I have to manage 0-139 ?)
the next 2 bytes are prevStp and newStep
move is the head move of ph_track
The algorithm is :
Based on what I read, I do not manage half or quarter tracks...
What I am missing is the head move based on the GPIO pin configuration :
Question 1: 0011 to 0110, what is the head move ?
Question 2: Do I have to manage the negative edge, and in the Question 1 case catch 0011 -> 0010 -> 0110 -> 0100 ?
Question 3: Why does the disk controller keep sending stepper motor interrupts while the track is already at 0 ?
thanks
Vincent
Reposting the output of the interrupt buffer
[Image: interrupt buffer]
Answers:
1. To question 1: The head is positioned on a quarter track and will move half a track up/down.
2. To question 2: Yes, all changes should be captured. So adjust the STM not for edge but for level triggered interrupts.
3. To question 3: There are no interrupts of any kind issued by the DISK II controller. During init it moves the head "past" position 0 in the decreasing track number direction, and the drive's head carriage hits the mechanical stop many times, producing the typical Apple II sound, because there is no other way to be sure the drive's head has reached track 0 (the drive has no feedback sensor for that).
Just an idea -- it would be useful if you implement a sniffing/debugging mode for the STM to passively listen and display the current track of another working DISK II drive connected to the controller.
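The answers above can be turned into a small head position model. This is only a sketch under simplifying assumptions, not the algorithm of any particular emulator: bit 0 of `phases` is taken to be phase 0, a single energized phase k pulls the head to quarter-track position 2k (modulo 8), two adjacent phases pull it to the position in between, all other combinations are treated as "no pull", and the head is clamped to 0..139 quarter tracks. The function names are hypothetical.

```c
/* Position (0..7, in quarter tracks modulo 8) that the energized magnets
 * pull the head toward, or -1 when this simple model sees no net pull
 * (no phase, opposite phases, three or more phases). */
static int magnet_target(unsigned phases)
{
    switch (phases & 0x0Fu) {
    case 0x1: return 0; /* phase 0 */
    case 0x3: return 1; /* phases 0+1 */
    case 0x2: return 2; /* phase 1 */
    case 0x6: return 3; /* phases 1+2 */
    case 0x4: return 4; /* phase 2 */
    case 0xC: return 5; /* phases 2+3 */
    case 0x8: return 6; /* phase 3 */
    case 0x9: return 7; /* phases 3+0 */
    default:  return -1;
    }
}

/* New head position in quarter tracks (0..139) after the phase lines
 * change to `phases`. Meant to be called on every change of the lines,
 * both rising and falling. */
int step_head(int qtrack, unsigned phases)
{
    int target = magnet_target(phases);
    if (target < 0) {
        return qtrack;
    }
    int delta = (target - (qtrack & 7) + 8) & 7; /* 0..7 */
    if (delta == 4) {
        return qtrack; /* equally far both ways: assume the head stays */
    }
    if (delta > 4) {
        delta -= 8;    /* shorter to move outward: -3..-1 */
    }
    qtrack += delta;
    if (qtrack < 0) {
        qtrack = 0;    /* mechanical stop at track 0 */
    }
    if (qtrack > 139) {
        qtrack = 139;
    }
    return qtrack;
}
```

In this model, going from 0011 (position 1) to 0110 moves the head by two quarter tracks, i.e. half a track, and energizing phase 3 while at position 0 leaves the head at 0, which corresponds to the carriage hitting the stop during recalibration.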
Ok thanks,
I found a document, Software control Disk II or IWM Controller, with timing and 1 example at the end.
Before I start to write the head move algorithm, does one exist somewhere that I can use ?
Thanks
Vincent
In post #19, 'VIBR' wrote:
" Before I start to write the head move algorithm, does it exist somewhere and one I can use ?"
Uncle Bernie advises:
Your best bet is to: "use the Force and read the Source" of any capable Apple II emulator. To support floppy disk emulation from DSK and WOZ files they all need to emulate the 74LS259 addressable latch which keeps the four stepper motor phases. And then they decode these phases to determine movement of the head and to which track. You do not need to re-invent anything here. Just copy their code. But make sure it supports half and quarter tracks.
The "Tome" book also explains how these stepper motor phases work in the context of copy protections. But why reinvent the wheel ?
- Uncle Bernie
Thanks
I have ordered a copy of Tome, I will receive it today and I have now a 74LS122 & 74LS123.
Vincent
OK, using two 2D arrays, one for direction and one listing magnet positions, I manage head moves from 0 to 139, meaning I can manage quarter tracks and get the track number with a >> 2 bit shift.
OK, now moving on to loading WOZ tracks into memory.
UB said : "You need to do something to "wrap around" the DMA to the beginning of the track buffer once a complete revolution of the simulated floppy disk has happened"
Can you please elaborate a bit more on the wrap around ?
Vincent
Your firmware should start outputting data from the beginning of the woz track buffer if possible ideally without any noticeable delay.
There must be a bit counter of the current track. When a track change occurs then the outputting of the next track should start from the bit counter+1. Or, in the worst case from a bit counter+T/bittime, where T is a fixed time that your firmware needs to switch tracks. Try to keep T as short as possible.
I used to successfully format the 36th track: track 35 [$22] using a Disk ][. https://forum.vcfed.org/index.php?threads/any-apple-ii-5-25-drives-do-40-tracks.76382/
OK, I am making good progress in moving away from the 4x buffer size used to manage the empty space.
I successfully managed to create a 1 us data stream from a 4 us SPI DMA output using :
- a 74LS06 to invert the signal
- a 74LS123 with a 22K resistor & 100 pF capacitor, using the SPI clock (inverted as well)
The clock seems to be OK.
I managed to use a plain buffer straight from the SD card nibble data. I need to do some checks to see why I cannot load the splash screen of Boulder Dash II (whereas it was working with the previous buffer management).
Now the real stuff starts with the WOZ file.
From what I read, a full WOZ track is 6 KByte.
If I need 3 tracks => 19 K, it will not work.
Using a circular buffer I can cut a track in 2 and it should work.
Note for UB: I received the book TOME, lot of code pages....
Vincent
Just in case you didn't find a good example for stepper motor phases to track conversion yet, look at the AppleWin-1.30.12.0 source code, file 'Disk.cpp', and search for 'magnet'. This is the last version I have downloaded, but there are newer ones, too:
https://github.com/AppleWin/AppleWin/releases
It's good that you got the "Tome" book, it will help you a lot to understand Apple II copy protections, so you can support them properly. You can download all the source code in the book from their web page, it's not easy to find where and how, and I don't remember how I did it, I just remember it took me more than a few mouse clicks to find it.
A 1us pulse width is a bit too low for guaranteed proper edge detection at a 2 Mhz clock, I recommend to use 1.3us, this will also provide guard bands for tolerances of the 74123 and the R and C on it.
The "Tome" book also has a lot of code for inspection tools inside which may help you with diagnosing problems. I did not try them out yet (still busy with developing my IWM substitute) but sooner or later I may use these tools to debug my various floppy disk controller solutions.
I don't think you need three 6 kByte track buffers because this is not how the protections were done. So far I have found no combination of spiral tracks or fat tracks with higher bit cell densities combined. So unless you have a spiral or fat track copyprotection, one track buffer should do, because any head movement between regular tracks will cause loss of sync anyways and the whole thing gets less timing critical. But still, you need to be able to load a track buffer fairly quickly when the head moves. There are copy protections which are based on "synchronized tracks" aka "skew aligned tracks", there are many names for it, and these critically depend on the RDDATA stream of the final track being there once the head stopped moving and ceased vibrating. This head settling time is in the order of several milliseconds and you don't have any more time than that to load and restart the new track buffer. So it will also depend from which type of flash card you pull your data. It may be necessary to keep pointer information for each track in the WOZ file in a small buffer, just to be able to access the data on the flash card fast enough without going through all the FAT motions.
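One way to build such a small pointer table is to decode the per-track entries of the WOZ file once, when the image is selected. The sketch below assumes the WOZ 2 layout as I read the specification: each track entry is 8 bytes, little endian, holding a starting block (16 bit), a block count (16 bit) and a bit count (32 bit), with blocks of 512 bytes counted from the start of the file. Please verify this against the WOZ specification before relying on it; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Where one track lives in the image file, precomputed so that a head
 * move needs no parsing at all. Hypothetical type. */
typedef struct {
    uint32_t file_offset; /* byte offset of the track bitstream in the file */
    uint32_t byte_count;  /* bytes reserved for the track (whole blocks) */
    uint32_t bit_count;   /* number of valid bits in the bitstream */
} TrackLocator;

/* Decode one 8-byte track entry (assumed WOZ 2 layout, see above). */
TrackLocator woz2_locate_track(const uint8_t entry[8])
{
    uint32_t start_block = (uint32_t)entry[0] | ((uint32_t)entry[1] << 8);
    uint32_t block_count = (uint32_t)entry[2] | ((uint32_t)entry[3] << 8);
    TrackLocator loc;
    loc.file_offset = start_block * 512u;
    loc.byte_count = block_count * 512u;
    loc.bit_count = (uint32_t)entry[4] | ((uint32_t)entry[5] << 8) |
                    ((uint32_t)entry[6] << 16) | ((uint32_t)entry[7] << 24);
    return loc;
}
```

Turning `file_offset` into SD card sector numbers still depends on how the file is laid out in the FAT; caching that mapping as well would be the remaining step.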
- Uncle Bernie
A decent disk II drive emulator handling woz images should have been done with a PC's LPT port. I am still tempted.
Keep in mind that 3.5" drives use "double data rate", e.g. about 0.5uS pulses/2uS period.
Thanks Uncle Bernie, I have a pretty good (ultra fast) stepper motor function inspired by Steve Emu.
In the STM32, I have a pointer to the data left to be transferred, and I think I can pause, switch buffers and resume the transfer.
Using a circular buffer I can manage 3 half tracks in parallel, so it means that I will have to load 4 x a quarter track into the circular buffer.
OK, I will extend the pulse width to 1.3 us and do some testing.
Vincent
In post #24, 'retro_devices' wrote:
" Your firmware should start outputting data from the beginning of the woz track buffer if possible ideally without any noticeable delay."
" There must be a bit counter of the current track. When a track change occurs then the outputting of the next track should start from the bit counter+1. Or, in the worst case from a bit counter+T/bittime, where T is a fixed time that your firmware needs to switch tracks. Try to keep T as short as possible. "
Uncle Bernie comments:
This well meant advice is wrong and confusing. What you really need is a timer within the STM32 which runs ALL THE TIME and has a prescaler setting such that it counts up and 'rolls over' to zero for each full revolution of the emulated floppy disk. Choose it such that you can easily calculate a "byte index" into your track buffer from the state of that timer. After head movement, use the timer to calculate the byte index for the position in the track buffer of the new track on which to start the DMA to the SPI.
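A rough sketch of that index calculation, assuming a nominal 200 ms revolution and a hypothetical timer that ticks every 4 us (so it rolls over at 50000). The names `REV_TICKS`, `bit_index` and `byte_index` are illustrative only; a real timer may well have a coarser resolution.

```python
REV_TICKS = 50000   # hypothetical: 200 ms per revolution / 4 us per tick

def bit_index(timer_count, track_bit_count, rev_ticks=REV_TICKS):
    """Scale the free-running timer value to a bit position in the new track."""
    return (timer_count * track_bit_count) // rev_ticks

def byte_index(timer_count, track_bit_count, rev_ticks=REV_TICKS):
    """Byte offset in the track buffer at which to restart the DMA."""
    return bit_index(timer_count, track_bit_count, rev_ticks) // 8

# Head settles when the timer reads 25000 (half a revolution);
# the new track is assumed to hold 51200 bits.
print(byte_index(25000, 51200))
```

Because the timer is never reset, two tracks entered at the same timer value start at the same angular position, which is what the synchronized track checks look for.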
The 'wrap around' is a tricky problem to solve. Once the bit stream reaches the end of the track buffer, the DMA must be reloaded to continue with the position in the track where the RDDATA pattern 'closes the circle' and repeats itself. This is evident from the way the floppy disk rotates. The data stream is circular in nature. Hence, a ring buffer is what is needed. But DMA normally does not support ring buffers ... I can't give any advice there on what the STM32 can and can't do. With my RATWEASEL floppy disk emulator (which is a flux engine) I have solved the problem by exploiting the FIFO in the Super I/O chips found in some notebook computers. This FIFO for the ECP mode of the parallel printer port was provided to allow software polled ECP operation without DMA. But when used with DMA, there is just enough data in the FIFO when the DMA done interrupt occurs to reload and restart the DMA controller with the beginning of the track, without causing a gap in the RDDATA stream.
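To make the wrap-around concrete, here is a small sketch that splits one transfer request into the pieces a DMA reload would need when the request crosses the end of the track buffer. `dma_segments` is a made-up helper, and it works on whole bytes; a WOZ track whose bit count is not a multiple of 8 would need extra handling of the last partial byte. (For what it is worth, the STM32 DMA channels do offer a circular mode with half-transfer and transfer-complete interrupts, which may cover part of this.)

```python
def dma_segments(start, length, track_len):
    """Split a transfer of `length` bytes starting at `start` into
    (offset, count) pieces that never run past the end of the track."""
    segments = []
    pos = start % track_len
    remaining = length
    while remaining > 0:
        run = min(remaining, track_len - pos)
        segments.append((pos, run))
        pos = (pos + run) % track_len   # wrap back to the start of the track
        remaining -= run
    return segments

# A 512-byte transfer starting 400 bytes before the end of a 6400-byte track.
print(dma_segments(6000, 512, 6400))
```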
I have not implemented support for WOZ files there and use synthetic flux patterns based on offline compilation of Apple II disk nybble data, so I also can't comment on how to do the 'wrap around' for WOZ files properly. RATWEASEL is no floppy disk emulator (yet). But it can produce RDDATA patterns for every known copy protection, for every kind of GCR, FM and MFM modulation scheme, as long as a bit cell is not much smaller than 2 us. This is the speed limit for typical ECP ports. I did not explore that limit because it's fast enough by ECP spec for anything I do with it. I'm not interested in any higher density work. Just mentioning that to point out limits. The STM32 can't do what typical notebooks from the late 1990s / early 2000's can do in terms of storing and pumping data for a real 'flux engine'. The 'Greaseweazle', for instance, relies on fast USB to pump that data from and to a reasonably fast PC or notebook. With your approach you are limited to 8 bit cells per byte of RAM you have, but for most Apple II floppy emulation work this should be enough, IMHO. The reason why Apple II does not really need the signal fidelity offered by a flux engine is that the DISK II controller has a fixed 2 MHz clock and no PLL. And the next reason is that unlike other computer systems of the time, Apple II had many different floppy disk drives - Apple used different manufacturers, and the clone makers did the same. This limits how "bold" or "daring" Apple II copy protections could be. A little bit of timing trickery is possible ... but this can be emulated without having a real flux engine to produce the RDDATA stream. But the analysis of such original floppy disks of course does need a flux engine. Trying to understand what is going on without having the accurate timing data on the RDDATA pulses coming off the real floppy disk is just too much guesswork. This is why the 'flow' to make WOZ files from a real copy protected floppy disk is beyond the scope of this thread.
(See the 'Applesauce' project to learn more on that). We do not want to go there, we just want a good floppy emulator that is reasonably cheap to build, and anything more sophisticated must be added by higher level manipulations of track buffer contents, and not by providing (expensive) flux engine capabilities to the floppy emu itself. Always keep this in mind. Since the 'copy protections war of the 1980s' is over, we just need to address whatever we have at hand from that time. And I'm confident that your STM32 based approach can do that. If not, we use a 4AM crack. As easy as that.
- Uncle Bernie
I can easily change the SPI clock divider to increase the frequency and manage a 500 ns read pulse. The only thing is to make sure that load time < read time.
I will do some timing tests with the CPU watchdog cycle counter to see precisely what it gives.
OK, I changed the RC values and added a Schottky diode to avoid reverse current on R/ext:
Diode 1N4148
C : 10 nF
R : 330 Ω
Now the data pulse width is 1.375 us
Channel 1 (White) : SPI Clock pulse (2 us phase)
Channel 2: SPI Data pulse
Channel 3: 74LS123 Data pulse (1.375 us)
Channel 6: Clock pulse from sync timer (1 us period) to decode Channel 3 data and ensure no data shift
Screenshot from 2024-06-08 07-36-15.png
Zooming in on the logic analyzer, I see some very strange glitches:
the red channel is toggling up/down, which is strange because the clock pulse (channel 1) is low and the data (channel 2) is high.
Why do I have this?
Screenshot 2024-06-08 at 08.22.02.png
Quickly, my schematic:
Screenshot 2024-06-08 at 09.06.31.png
Adding 100 nF capacitors to the power lines of the 74LSxx chips seems to fix the issue, but the Apple II is still not loading the splash screen like before...
progressing ;)
processed-DC6E5C40-3600-4D25-BB47-7D1686A29BE4.jpeg
My wire mess ;)
processed-6421B9E4-818E-487C-A50D-6F0CE80E2289.jpeg
The signals seen in your post #32 look right to me. Not sure from memory if you should route the /Q output of the 74123 to the RDDATA line. I'm always confused with that because Apple documentation is inconsistent as to the polarity of this signal. But as far as the negative edge detector is concerned, it will work either way.
Congrats on your progress seen in your post #36. BTW, there is a reason why professional hardware designers avoid these plugboards. They always have been troublemakers. For digital work, the power supply and ground rails are too poor, and the long wires (being inductors in the nano-Henry range), if used to connect VCC/GND to the ICs, make matters worse. For analog work, except if it is very slow / low bandwidth, the parasitics of the plugboard will conspire against any good outcome. Any kind of RF work is impossible on plugboards. And fast digital logic work, too, because it's the same laws of Physics, transmission lines, impedance matching, etc.
So what I would recommend to you is that once you have figured out how to derive a useful SPI clock from the Apple II's Q3 and WRDATA signals (needed for the emulation of write to disk operations), build another prototype on a real PCB, preferably with a ground plane, or at least really solid VCC/GND traces and plenty of power supply bypass capacitors.
I've seen so many developers struggling with unstable hardware built on such plugboards, wasting time trying to hunt down elusive gremlins, even modifying their RTL code to try to fix things ... and all that being a futile exercise, because all the problems would go away by themselves once the plugboard is replaced with something better suited for the work.
Applefritter member 'frozen signal' who is working on MMU / IOU substitutes based on fast CMOS CPLDs, IMHO, is one of those victims of plugboards. They are legion. I will not tell you all the electronics industry insider jokes about these plugboards, except for this one:
Manager walks through the company lab and notes down everyone who uses a plugboard ... to be put on the next layoff list.
Just sayin' ... I remember when these evil contraptions came out more than 50 years ago and at first they looked like a great idea to save time, so people bought them. And then discovered all the ill effects. Not only the parasitics ... contacts would wear out soon and they would scrape off tin/lead coating from component leads, and this debris would then accumulate and vagabond around inside the plugboard and cause random short circuits. And so on ... and after these lessons were learned, the plugboards were tossed where they belong, into the trash can. Half a century later the Chinese made knockoffs with even worse contact spring quality get peddled to unsuspecting hobbyists. Oh and don't get me going on those colleges and universities who force their students to build a TTL based CPU on larger plugboards. It's just stupid to do that ... unless the purpose of the mission is to make the students hunt down random contact issues.
- Uncle Bernie
Almost there,
I read woz, TMAP, Cluster track,
I inject the track in the DMA,
I can see the track changing but nothing on screen...
Just to speed you up on the write mode, which is not yet supported in your schematic, here is how to do it right:
As I mentioned before, you need a circuit which will generate a SPI input clock that is appropriate to capture the WRDATA bit cell stream. In the DISK II controller card, this is coming from the Q3 clock, which is ~ 2MHz (actually, 14.3181818MHz divided by 7). WRDATA may toggle every 8 Q3 clocks, but it does so only in case of a "1" bit cell. You need to capture the WRDATA bit stream in the middle of the bit cell by SPI. And I suspect even the STM32 SPI does not have the capability to do that without some external help.
What you need is a oneshot (use the second one in a 74123, the first one is for the RDDATA pulse width) and set it up such that it will generate a short pulse (longer than the minimum "asynchronous clear pulse width" of the counter you will use) for either a rising or falling edge of WRDATA. Use this pulse to clear a binary counter. Clock this counter with some clock that is derived from the STM32 clock (maybe the STM32 can generate such a clock on some pin) which should be 1 MHz or more. Size the counter such that it can count at that clock for at least 4 bit cells (16 us) before it rolls over. Use one of the outputs of the counter which rises in the middle of each bit cell to clock the SPI. Data input of the SPI would be WRDATA (in the simplest case).
I think a 74xx393 could do that (it has 8 bits and an asynchronous clear input).
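As a step-by-step illustration of the counter scheme, here is a toy simulation with assumptions that are not from the post: a 1 MHz counter clock (one tick per microsecond), a 4-bit counter, the clear treated as instantaneous at each WRDATA edge, and the SPI clock taken from counter bit 1, which first goes high 2 us after a clear, i.e. in the middle of a 4 us bit cell. `sample_times` is a made-up name.

```python
def sample_times(edge_times_us, duration_us, tap_bit=1, width=4):
    """Return the times (in us) at which the tapped counter output rises.

    edge_times_us: times of WRDATA edges; each one clears the counter.
    """
    edges = set(edge_times_us)
    mask = (1 << width) - 1
    counter = 0
    prev_tap = 0
    samples = []
    for t in range(duration_us):
        if t in edges:
            counter = 0                      # oneshot pulse clears the counter
        else:
            counter = (counter + 1) & mask   # free-running 1 MHz count
        tap = (counter >> tap_bit) & 1
        if tap and not prev_tap:             # rising edge clocks the SPI
            samples.append(t)
        prev_tap = tap
    return samples

# WRDATA toggles at 0, 4 and 12 us: bit cells 1, 1, 0, 1.
print(sample_times([0, 4, 12], 16))
```

Each WRDATA edge re-aligns the counter, so the sample point stays centred in the cell even across a cell without a transition, as long as the counter does not roll over before the next edge.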
If you use the simple case, you need to correct the data bytes written into the track buffer, because WRDATA will change for each "1" being written, so you need to detect changes (current bit vs previous bit) to arrive at the correct 'floppy image' bit pattern. I think this can be done by a shift and EOR operation with handling the byte boundaries correctly. But I can't say if that would not turn into a kludge - you would need to somehow keep track of which bytes were written and which were not. A 2nd track buffer which is initialized to all zeros could capture the written data once the WRREQ gets asserted. After WRREQ gets deasserted, you then could look for nonzero bytes in the 2nd track buffer and correct/copy them to the 1st track buffer holding the "read" data which includes the sector headers etc. which in normal operation never should be overwritten - unless the whole track is getting formatted. The proposed method could handle that without any exceptions to be coded.
Those bytes in the 2nd track buffer which stay zero would mark those regions of the track in which the write gate never was activated. So it's quite elegant.
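The "shift and EOR" correction can be sketched as follows, assuming the SPI captured raw WRDATA levels MSB first, one sample per bit cell. `levels_to_flux` is a hypothetical helper, and the level before the first sample is assumed to be 0. The byte boundary is handled by carrying the last level of the previous byte into the top bit of the shifted copy.

```python
def levels_to_flux(level_bytes, prev_level=0):
    """Turn sampled WRDATA levels into disk bits: a bit is 1 wherever
    the level differs from the level in the previous bit cell."""
    out = bytearray()
    for b in level_bytes:
        shifted = (b >> 1) | (prev_level << 7)   # each bit's predecessor
        out.append(b ^ shifted)                  # 1 where the level changed
        prev_level = b & 1                       # carry into the next byte
    return bytes(out)

# Levels 1,0,0,1,1,1,0,0 then 1,1,1,1,1,1,1,1
print(levels_to_flux(bytes([0x9C, 0xFF])).hex())
```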
You could dispense with the code needed to do the correction (which is turning 'changed' bits to '1') by adding a WRDATA change detector in the external logic. Whenever WRDATA changes, you need to present a '1' to the SPI input. This can be done by capturing the old WRDATA state and comparing it to the current one. You could also use a oneshot which would trigger at both the rising and the falling edge of WRDATA, and run for 3 us, so at half the bit cell time, when the SPI clock happens, the STM32 would 'see' the '1' indicating that WRDATA has changed.
All this external logic could also be put into a small PAL or GAL and with the appropriate clock you would need no oneshots, and it would support both the READ and the WRITE mode. If you want to do that, but don't have the tools, I could design such a GAL for you, just send me a PM.
Note that all operations to and from the track buffer must be governed by the timer I mentioned in some post above, which rolls over once per floppy disk revolution, and is always used to calculate byte indices into the track buffers where to start / stop operations. This timer is set up once and otherwise is left alone. So during the whole emulation session, this timer runs and runs and runs and is never reset, even when tracks are changed. This is very critical to get the timing right for all the copy protections looking for synchronized tracks. A refinement may be needed for those copy protections which use changed bit cell data rates; this may entail a change of the SPI clock frequency and a different factor to calculate the byte index from the timer. But the timer as such would never be reconfigured. It is not necessary to be hyper accurate there ... as long as you are accurate to +/-1%, for which an 8 bit timer is enough, you should be good.
(If some readers now suspect that I already have designed and built a floppy disk emulator myself, they are right, how else could I know how to do it. But this was 30 years ago and not for the Apple II, which has its own rules deviating from the "industry standard" floppy disk interface. Hence, take the above with a grain of salt ... I did my best to take the Apple II idiosyncrasies into account in this post, but can't guarantee that all the above is flawless and would work as described for Apple II floppy disk emulations).
- Uncle Bernie
I have implemented the 3-track management and I am struggling with many segfaults...
The circular buffer is 512 bytes in size and I need 7 buffers:
One for the DMA
3 for the current data block within the 13 data blocks of a track
3 for the next data block.
The crash origin might be:
- getting 512 bytes from the SD using the STM32 HAL (Hardware Abstraction Layer) consumes 340 K CPU cycles;
- I need to get 3 data blocks to manage the 3 tracks;
- sending a data block over DMA takes 512*8*4us * 64 cycles = 1 M cycles;
=> It means that the DMA has finished while the read from the SD is not over.
I need to optimize the SPI read function before moving ahead.
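The budget above can be checked with a few lines of arithmetic. The figures are the ones quoted in the post (64 cycles per microsecond, 4 us per bit, 340 K cycles per 512-byte HAL read); the function names are illustrative.

```python
CYCLES_PER_US = 64      # from the post: 64 CPU cycles per microsecond
BIT_CELL_US = 4         # one bit cell on the RDDATA line
SD_READ_CYCLES = 340_000  # measured cost of one 512-byte read via HAL

def dma_cycles(n_bytes):
    """CPU cycles that elapse while the DMA plays out n_bytes."""
    return n_bytes * 8 * BIT_CELL_US * CYCLES_PER_US

def sd_cycles(n_blocks, per_block=SD_READ_CYCLES):
    """CPU cycles needed to fetch n_blocks from the SD card."""
    return n_blocks * per_block

margin = dma_cycles(512) - sd_cycles(3)
print(dma_cycles(512), sd_cycles(3), margin)
```

On these numbers the three reads fit inside one block's play-out time by less than 3 %, so any extra overhead (interrupts, buffer copies) is enough to let the DMA run dry.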
OK, confirmed: the reason for the crash is that the DMA requests data before the read from the SD card has completed.
SD card reading now takes 190 K CPU cycles,
140 K cycles of which are pure waiting for the SD card to be ready....
So I will increase the read buffer and take a different approach, otherwise it will not work.
EDIT 2 : Using CMD18 instead of CMD17 I can read multiple data blocks without paying the SD card init time for each one.
EDIT 3 : Now I can read 4 blocks in 193 K CPU cycles (still 140 K cycles waiting for the SD).
I have optimized the SPI function to move away from the HAL, saving 1/3 of the cycles.
I will adjust the way I refill the buffer and increase the buffer size ...
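For reference, a sketch of how the two read commands look on the wire in SPI mode: a 6-byte frame made of 0x40 plus the command index, a 32-bit big-endian argument, and a CRC7 with the stop bit. `sd_command` is a made-up helper and the argument is assumed to be a block number (block-addressed card). CMD17 reads one block; CMD18 keeps streaming blocks until CMD12 (stop transmission) is sent.

```python
def crc7(data):
    """CRC7 (polynomial x^7 + x^3 + 1) as used in SD command frames."""
    crc = 0
    for byte in data:
        for _ in range(8):
            crc = (crc << 1) & 0xFF
            if (byte ^ crc) & 0x80:
                crc ^= 0x09
            byte = (byte << 1) & 0xFF
    return crc & 0x7F

def sd_command(index, argument):
    """Build the 6-byte SPI-mode command frame."""
    body = bytes([0x40 | index]) + argument.to_bytes(4, "big")
    return body + bytes([(crc7(body) << 1) | 1])

READ_SINGLE_BLOCK = 17     # CMD17
READ_MULTIPLE_BLOCK = 18   # CMD18
STOP_TRANSMISSION = 12     # CMD12

print(sd_command(0, 0).hex())                      # CMD0, known CRC byte 0x95
print(sd_command(READ_MULTIPLE_BLOCK, 2048).hex())
```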
I made some tests, and my conclusions are:
- Loading a partial track to memory from the SD card works with CMD18, but it requires multi-block transfers (minimum 4 blocks of 512 bytes) to avoid the waiting time seen with CMD17.
- Working with partial tracks in memory and a circular buffer for the DMA SPI leads to quite complex code, and this is due to the limitation of the STM32F103 with only 20k of RAM (3 tracks require 3x13 blocks of 512 bytes = 19968 bytes, leaving no room for other RAM needs).
- I would go for the STM32F401CCU6 with 64 kBytes of RAM so that I can easily manage 3 full tracks in memory.
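The RAM figures can be checked quickly; the sizes are the ones given above, and the helper name is illustrative.

```python
BLOCK = 512
BLOCKS_PER_TRACK = 13

def ram_left(ram_bytes, tracks=3):
    """Bytes left after holding `tracks` full track buffers in RAM."""
    return ram_bytes - tracks * BLOCKS_PER_TRACK * BLOCK

print(ram_left(20 * 1024))   # STM32F103 with 20 kBytes
print(ram_left(64 * 1024))   # STM32F401CCU6 with 64 kBytes
```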
to be continued...
Hello UncleBernie,
I have almost finished the code for the STM32F401, which I should receive in the coming days.
Thanks for your detailed explanation of the writing process, but I do not get it, sorry.
I understand the use of the 74LS123 (channel B) to trigger a clock on the STM32 side. I assume that the clock will clear the 74LS393 counter after 16 us? After that I am a bit lost.
I would need a step-by-step description to understand the way the SPI RX will be fed.
Thanks
Vincent
EDIT 1: There is also one thing I do not get between NIBBLE files and WOZ. When I look at the binary content of a NIC file with a hex editor, I see the 10-bit signature, the D5 AA 96 prologue and so on, but I cannot see that in the WOZ file. Why? It should contain the same data expected by the disk (sorry if the question is stupid, but I do not understand).
Because the woz file defines the time between flux transitions (pulses). You mentioned you read and understood the woz file specification, didn't you? It must be easier to "play" woz than other formats.
Well, I am not sure I fully understand the woz format, I need some help please.
The full specification is here https://applesaucefdc.com/woz/reference2/
OK, I think I am starting to understand the circuitry and the logic of the SPI clocking and the WRDATA write process.
Just to rephrase:
- WRDATA is connected to SPI RX,
- the STM32 is feeding the counter clock at 1 MHz with a PWM timer,
- SPI CLK is triggered at the 4th count of the counter and the counter is cleared
Retro Devices, thanks, I have read the spec for WOZ 2.0.
What I do not get properly:
On one hand I send NIBBLES directly to the RDDATA pin with the NIC file; on the other hand I send the bitstream of the WOZ file and it gets decoded by the logic sequencer. How does the disk tell the difference and choose whether to decode or not?
Sorry if it is a stupid question or if I am missing something.
Vincent
This is version 2.1 of the specs, not 2.0. You have several different disk image formats that pack data in different ways. It is up to the emulator to distinguish them and treat them accordingly, not the Apple2.
OK, I used a V2.0 file and I cannot find the equivalent of the NIBBLE signature inside the data of the track.
Read again the first 3 sentences of this message https://www.applefritter.com/comment/108578#comment-108578
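One way to see why the D5 AA 96 prologue shows up in a hex editor for a nibble file but not for a WOZ track: the WOZ track is a packed bit stream that still contains the extra zero bits after each sync byte, so the nibbles are usually not aligned to byte boundaries. The sketch below uses a simplified model of the controller's data latch (shift bits in until the top bit is set), and all names are made up for illustration.

```python
def byte_bits(value, extra_zeros=0):
    """8 bits of `value`, MSB first, plus optional trailing zero bits."""
    return [(value >> (7 - i)) & 1 for i in range(8)] + [0] * extra_zeros

def pack_bits(bits):
    """Pack a bit list into bytes, MSB first, zero-padding the last byte."""
    padded = bits + [0] * (-len(bits) % 8)
    out = bytearray()
    for i in range(0, len(padded), 8):
        value = 0
        for b in padded[i:i + 8]:
            value = (value << 1) | b
        out.append(value)
    return bytes(out)

def read_nibbles(bits):
    """Simplified latch: shift in bits, emit a byte once its top bit is set."""
    nibbles, latch = [], 0
    for b in bits:
        latch = ((latch << 1) | b) & 0xFF
        if latch & 0x80:
            nibbles.append(latch)
            latch = 0
    return nibbles

# Five 10-bit sync bytes (FF plus two zero bits), then the address prologue.
bits = []
for _ in range(5):
    bits += byte_bits(0xFF, extra_zeros=2)
for value in (0xD5, 0xAA, 0x96):
    bits += byte_bits(value)

packed = pack_bits(bits)             # what a hex editor shows for a bit stream
print(packed.hex())
print([hex(n) for n in read_nibbles(bits)])
```

In both cases the emulator sends bits to RDDATA; a nibble file simply leaves out the timing bits that a WOZ track carries explicitly.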