
How would you approach this technical problem?

ygolo

My termites win
Joined
Aug 6, 2007
Messages
5,988
Suppose you work for a technology company that makes chips (integrated circuits), and the following task was given to you.

Make sure that a chip design (already manufactured, and not designed by you) will function for customers at frequencies of F_low to F_high inclusive.

Complications (normal)
  • Initial testing shows some issues at frequencies between F_low and F_high at certain process skews, voltages and temperatures. However, workarounds have been found for many of the issues, and for most of the rest, it was found that the customer does not care about the particular set of conditions that make the parts fail.
  • However, at F_high, the design seems extremely marginal. All sorts of features fail under seemingly random conditions.
  • You are allowed to come up with "solid workarounds" to make sure the parts going to customers will work with the "solid workarounds" in place. It is too late to make design changes to the chip itself. However, you can change software, firmware, external resistors, capacitors, cables, and so on, as long as the changes don't cause the customers to reorder parts or rewrite their software.
  • You must come up with a "screen" to test parts to make sure they will work when the customers receive them...and the parts must last.
  • Many of the registers and pins have secret functions that you can only find out by asking the right person.
  • The design itself is a closely guarded secret, but if you know the right people to ask, you can still get access to most of it without breaking any rules.
  • The design is a complicated mix of analog and digital circuits.
  • There is ample data collected on the problems, but very few conclusions, if any. Data collection is easy, making sense of the data at F_high is proving to be near impossible.
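One way to start making sense of a large pile of pass/fail data is to tabulate the failure rate against one condition at a time. The following is only a minimal sketch: the record fields (`skew`, `voltage`, `temp_c`, `passed`) and the sample values are made up for illustration and are not taken from the actual test data.

```python
from collections import defaultdict

def failure_rate_by(records, key):
    """Group pass/fail records by one condition and return the failure rate per value."""
    totals = defaultdict(int)
    fails = defaultdict(int)
    for rec in records:
        totals[rec[key]] += 1
        if not rec["passed"]:
            fails[rec[key]] += 1
    return {value: fails[value] / totals[value] for value in totals}

# Hypothetical records; field names and values are assumptions for illustration.
records = [
    {"skew": "slow", "voltage": 0.95, "temp_c": 125, "passed": False},
    {"skew": "slow", "voltage": 1.05, "temp_c": 125, "passed": True},
    {"skew": "fast", "voltage": 0.95, "temp_c": 25, "passed": True},
    {"skew": "fast", "voltage": 1.05, "temp_c": 25, "passed": True},
]

for key in ("skew", "voltage", "temp_c"):
    print(key, failure_rate_by(records, key))
```

A condition whose values show very different failure rates is a candidate for closer study; a condition whose values all fail at roughly the same rate probably is not driving the failures on its own.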

Further complications (not normal):
  • You are part of a small group of people on this task
  • Many of your peers and superiors seem to have given up, but still expect magic solutions
  • Customers are already using parts and somehow that has to be managed (quiet recalls/part swaps, etc.)
  • F_high has to work in two weeks, despite others having spent 3 months trying to make it work

Please brainstorm!

I would like outsiders' views on the situation, but I cannot give many details due to my confidentiality agreement. Sorry. I would love to get technical, but this has to stay abstract.
 

Valiant

Courage is immortality
Joined
Jul 7, 2007
Messages
3,895
MBTI Type
ENTJ
Enneagram
8w7
Instinctual Variant
sx/so
Easy, I bring a gun to the office and force everyone to cooperate in order to find a solution.
When they are done and successful, they can go home to their loved ones.
If they ever tell anyone or shed any negative light on me or even mention the incident, I will kill them before they can stand trial.
 

Little_Sticks

New member
Joined
Aug 19, 2009
Messages
1,358
...
  • You are allowed to come up with "solid workarounds" to make sure the parts going to customers will work with the "solid workarounds" in place. It is too late to make design changes to the chip itself. However, you can change software, firmware, external resistors, capacitors, cables, and so on, as long as the changes don't cause the customers to reorder parts or rewrite their software.

...

  • Many of the registers and pins have secret functions that you can only find out by asking the right person.
  • The design itself is a closely guarded secret, but if you know the right people to ask, you can still get access to most of it without breaking any rules.

...

  • There is ample data collected on the problems, but very few conclusions, if any. Data collection is easy, making sense of the data at F_high is proving to be near impossible.

Further complications (not normal):
  • You are part of a small group of people on this task
  • Many of your peers and superiors seem to have given up, but still expect magic solutions
  • Customers are already using parts and somehow that has to be managed (quiet recalls/part swaps, etc.)
  • F_high has to work in two weeks, despite others having spent 3 months trying to make it work


...

That doesn't really sound like a problem, just a lot of sleuthing and *cough* meticulous work.

I don't get it. If it needs to work in two weeks, then why don't they just give you all the information about what it is you're working with?

All I can say is to take what you have and organize so that it can solve your 'problem' and nothing else (I know that's vague but we don't have much to work with); so create your own conclusions with what you have got, I suppose?

NEED MORE INFORMATION

Maybe an ENTJ will jump in and give you some kind of Te system to follow.
 

JAVO

.
Joined
Apr 24, 2007
Messages
9,178
MBTI Type
eNTP
Start with changing the software. Besides, if you can't do the fix there, it sounds like you're mostly screwed for now anyway. :D
 

ygolo

My termites win
Joined
Aug 6, 2007
Messages
5,988
That doesn't really sound like a problem, just a lot of sleuthing and *cough* meticulous work.

Why did you cough? Do you look down on meticulous work?

I don't get it. If it needs to work in two weeks, then why don't they just give you all the information about what it is you're working with?

It's all about knowing who to ask. I have the "clearance" needed, just need to figure out who has what information.

All I can say is to take what you have and organize so that it can solve your 'problem' and nothing else (I know that's vague but we don't have much to work with); so create your own conclusions with what you have got, I suppose?

I do draw my own conclusions, but we are all on the hook for this. There is supposed to be a 'task force' on this, but the members seem to be bickering and a few have decided not to talk to each other. The leader has also stopped holding a regular meeting. I may nudge him to start holding the meetings again.

NEED MORE INFORMATION

I know you do but I can't give it.

Maybe an ENTJ will jump in and give you some kind of Te system to follow.

You mean something like this:
Easy, I bring a gun to the office and force everyone to cooperate in order to find a solution.
When they are done and successful, they can go home to their loved ones.
If they ever tell anyone or shed any negative light on me or even mention the incident, I will kill them before they can stand trial.


Start with changing the software. Besides, if you can't do the fix there, it sounds like you're mostly screwed for now anyway. :D

Yeah, most of the fixes will be in the software.

There is some OOP, but it's not done well, and I certainly am not going to fix it.

The software is poorly managed too. It barely has revision control even.

Luckily, the software is actually fairly simple--just reading and writing of control bits mostly...very little in the way of threads or even loops for that matter.

The big question is of course what to put into the software. Which control bits to set? In what order?

You are probably fuxxed.

I am looking for a new job anyways.
 

Aleksei

Yeah, I can fly.
Joined
Mar 10, 2010
Messages
3,626
MBTI Type
ENTJ
Enneagram
7w6
Instinctual Variant
sx/sp
I would probably find a way to manipulate my coworkers into solving the problem for me.
 

millerm277

New member
Joined
Feb 1, 2008
Messages
978
MBTI Type
ISTP
Ah. Same field as me (I work with GaAs stuff), except I'm an intern, so I am not especially useful yet, just a good lab tech...

I would say: You'd better think hard and start trying to find patterns in the failures and their sources. The issues you're really facing seem to be coming from the people and structure of the company, not the chip. When getting the documentation is more effort than the work, I'd say you have a problem.

Nonetheless, you still seem screwed for the moment. Good luck.
 

Katsuni

Priestess Of Syrinx
Joined
Aug 22, 2009
Messages
1,238
MBTI Type
ENTP
Enneagram
3w4?
I don't know a ton about this area, but there's some standard troubleshooting rules that can be applied to quickly reduce the possible problem areas.



You are allowed to come up with "solid workarounds" to make sure the parts going to customers will work with the "solid workarounds" in place. It is too late to make design changes to the chip itself. However, you can change software, firmware, external resistors, capacitors, cables, and so on, as long as the changes don't cause the customers to reorder parts or rewrite their software.

As we know that the basic design can NOT be changed, that means that only external means can be affected, which's problematic. It may end up meaning that this is not actually possible to complete with the tools provided, regardless of how well yeu attempt to do so. However, there's usually a solution to any problem, the main issue is to focus on whot the actual problems are, and isolate their possible causes, then deal with those individually.

The problem here is that yeu're going to need a lot of trial and error. It's highly unlikely that the multitude of errors that occur at F_high are entirely related; most likely they're bunched in small related groups, and these groups themselves are unrelated to each other. This should make things easier, since if yeu cure one problem, several will likely vanish at a time. The end issue is that yeu are going to need time to isolate each issue and devise a workaround for each one that doesn't interfere with previous workarounds. The more 'creative' workarounds yeu have built into the design to make it work, however, the harder this becomes, as each new 'fix' may break a previous workaround.

I would seriously suggest avoiding messing with the software too much, as it has far too high a chance to inadvertently affect the previous workarounds, and leave yeu with an even less stable design as yeu try to correct problems that were already corrected previously, making yeur time spent redundant and not having actually created anything of progress.

As with ANY other troubleshooting problem... the first few steps are ALWAYS to identify the problems there are, isolate the possible causes, and test each cause in the order of plausibility, bleeding into the order of largest number of possible failures reduced at a time.

Yeu want this done quickly; ie 2 weeks. This means yeu need to be efficient in the problem solving department. Due to this, focusing solely on one problem at a time isn't going to work. Work on each individual flaw by doing the most likely errors, and then by the ones that can fix the most per attempt. If yeu get stuck on a problem, move to the next one and don't linger; come back to it later if yeu have time.
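The ordering described above (most plausible causes first, weighted by how many failures each would clear) can be sketched as a simple ranking. This is only an illustration: the scoring rule (plausibility times failures explained), the field names, and the sample numbers are all assumptions, not something from the actual project.

```python
def rank_hypotheses(hypotheses):
    """Sort candidate causes by plausibility times the number of failures each would explain."""
    return sorted(
        hypotheses,
        key=lambda h: h["plausibility"] * h["failures_explained"],
        reverse=True,
    )

# Hypothetical candidates; plausibility is a rough 0..1 guess made by the team.
hypotheses = [
    {"name": "cause A", "plausibility": 0.6, "failures_explained": 3},
    {"name": "cause B", "plausibility": 0.3, "failures_explained": 10},
    {"name": "cause C", "plausibility": 0.8, "failures_explained": 1},
]

for h in rank_hypotheses(hypotheses):
    print(h["name"])
```

The numbers are guesses by design; the point of writing them down is that the team argues about the ranking once, up front, instead of each person chasing a favorite theory.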

This isn't so much a 'fix everything in 2 weeks', as it's unlikely yeu'll be able to succeed. The things yeu CAN do, is allocate yeur time wisely and try to fix the largest number of things, and the most important things.

As such, focus yeur initial efforts on the features that are most commonly used, and most essential for use that are failing at F_high. If these fail, then the chip is pointless to have in the first place.

The information yeu've given is far too vague for anyone to give accurate ways to fix any of the problems, but setting up a troubleshooting method and having everyone on the team working on it to follow those steps should greatly accelerate the work yeu can do in the time yeu have.

Another problem is allocating members; yeu have several people working... I'd highly suggest assigning everyone a particular problem to work on, but starting everyone on a main problem first. If people find that they feel like a 5th wheel or aren't really needed, then they defer to their individual problems to focus on rather than waste time standing around doing nothing. If someone needs them for the main issue, then they can be easily found and go back to work there since they'll all have familiarity with it. This should further help aid yeu in the time management issue.

So yeah... yeu have two main factors working against yeu here; the possibility that some of these errors may not be possible to fix without damaging other workarounds, which may not be possible to fix short of a chip redesign which yeu can't do so yeu may be screwed from the start... and secondly, yeu only have a set amount of time to correct several different problems. It's LISTED as one problem "make it work", but there's several sub-issues which are related which need to be addressed, in that yeu mentioned it can fail several different ways.

As yeu can't necessarily deal with the former issue, I'd advise working on yeur time management more than anything, to get the most important, and the largest number, of things fixed, rather than trying to fix 'everything'. If yeu try to fix everything, it's far too likely that people will fall into a mindset of trying to come up with a single magical cure that fixes ALL the problems, and will waste their time looking for the elusive macguffin.

Keep people organized and manage time efficiently, and that's yeur best bet.
 

Fluffywolf

Nips away your dignity
Joined
Mar 31, 2009
Messages
9,581
MBTI Type
INTP
Enneagram
9
Instinctual Variant
sp/sx
Pardon me, I am not very knowledgeable about chips and their inner workings. But I'll humor you.

With a mix of analog and digital methods of storing and sending information, I assume that there may be issues with queueing signals or latencies? Is there any way to test for that? Finding out which circuits mess up in latency and put more resistance on the circuits that are going 'too fast'?

A chip is only as fast as its weakest link! Either find a way to make the weakest links faster, or slow down the faster links to match the weakest?

Anyhow, like I said, I am just humoring you. I have no idea if anything I just said makes actual sense, due to my lack of knowledge about chips.
 

entropie

Permabanned
Joined
Apr 24, 2008
Messages
16,767
MBTI Type
entp
Enneagram
783
Changing an organisational hierarchy is a bitch, and if you have many parties each cooking up a bit of the main product, you get the classic miscommunication of parallel development, which leads to the point where it is more work to figure out where things went wrong than to start all over.

If it's only magic that can help you, I'd advise you to be honest with your customer and superiors and tell them about the flaws that can't be fixed, so you don't risk a lawsuit. At this point you should watch out for yourself, so you don't get dragged into taking responsibility for things other people have fucked up.

You should come up with a quick master plan of where things could be wrong with the circuit, within the limits of what you can actually work on. Then allocate tasks to particular members of your team following that plan, so that with the team's help you can run some more tests on the chip.

In the end you should give a detailed report on the tests you ran, and either have fixed some problems or explain that, because you could not change the hardware design, you couldn't pinpoint a specific cause.

Especially with something like keeping a stable frequency in a chip, there are like a billion things that could have gone wrong in the first place, which is pretty much impossible to solve in two weeks (ranging from the materials used, to magnetic distortions in the chip, to hardware and software design).

Your supervisors shouldn't put you in a position like that, unless they know you really can do magic.
 

ygolo

My termites win
Joined
Aug 6, 2007
Messages
5,988
Thanks for the input guys.

I have been collecting my own data and it is slow going...and I am only seeing some basic trends.

We have at this point managed to pull a couple of other people to help work on the problem.

The biggest problem is that the chips are not consistently getting to a particular state and passing the "checksum" of bits on the communication channel for an appreciable amount of time for a wide enough set of voltages and other settings.
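One way to put a number on "a wide enough set of voltages" is to sweep a setting and measure the widest contiguous range over which the part passes. The sketch below assumes each sweep point has already been reduced to a single pass/fail result; the sweep values are made up for illustration.

```python
def widest_passing_window(results):
    """Return (start, end) of the longest run of consecutive passing settings, or None.

    `results` is a list of (setting, passed) pairs sorted by setting.
    """
    best = None
    best_len = 0
    run_start = None
    run_len = 0
    for setting, passed in results:
        if passed:
            if run_len == 0:
                run_start = setting  # a new passing run begins here
            run_len += 1
            if run_len > best_len:
                best_len = run_len
                best = (run_start, setting)
        else:
            run_len = 0  # a failure ends the current run
    return best

# Hypothetical voltage sweep: (voltage, passed).
sweep = [(0.90, False), (0.95, True), (1.00, True),
         (1.05, True), (1.10, False), (1.15, True)]

print(widest_passing_window(sweep))
```

Comparing this window across parts, temperatures, and software settings gives a margin figure that is easier to trend than raw pass/fail counts.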

We still have a little over a week to figure this stuff out.
 

A window to the soul

Guest
Have you considered electromagnetic interference?

...just throwing that out there.
 

ygolo

My termites win
Joined
Aug 6, 2007
Messages
5,988
Indeed I have. In particular, there is a type of interference known as "crosstalk." I am very suspicious that we have a lot of it.

But I am not sure how to compensate for it, nor how to prove it (since I don't have access to an oscilloscope fast enough).
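Without a fast enough scope, one indirect check is to compare the error rate on one signal while its neighbors are idle against the error rate while the neighbors are switching. This is only a sketch of the bookkeeping, and it assumes the test setup can count errors and control neighbor activity independently, which may not be true here. The counts below are invented.

```python
def crosstalk_indicator(quiet, active):
    """Ratio of error rate with neighbors active to error rate with neighbors quiet.

    Each argument is an (errors, bits) pair. A ratio well above 1 suggests that
    neighbor activity is hurting the victim signal; it does not prove crosstalk.
    """
    quiet_err, quiet_bits = quiet
    active_err, active_bits = active
    if quiet_bits <= 0 or active_bits <= 0:
        raise ValueError("bit counts must be positive")
    if quiet_err == 0:
        # No errors in the quiet run: ratio is undefined, so report the extremes.
        return float("inf") if active_err > 0 else 1.0
    return (active_err * quiet_bits) / (quiet_err * active_bits)

# Hypothetical counts: (errors, bits transferred).
print(crosstalk_indicator(quiet=(2, 1_000_000), active=(50, 1_000_000)))
```

A large ratio only shows correlation with neighbor activity; supply noise caused by the same switching would look similar, so it narrows the search rather than settling it.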
 

Katsuni

Priestess Of Syrinx
Joined
Aug 22, 2009
Messages
1,238
MBTI Type
ENTP
Enneagram
3w4?
Indeed I have. In particular, there is a type of interference known as "crosstalk." I am very suspicious that we have a lot of it.

But I am not sure how to compensate for it, nor how to prove it (since I don't have access to an oscilloscope fast enough).

If it's crosstalk, then it's probably the filter from whot little I know on the matter. If yeur signal filter is built into the chip as a physical constraint, then yeu're pretty much screwed. If it's a process, it may be possible to correct the issue. Best to look there to start with if yeu're able to. I'd imagine yeu have several chips to work with as test subjects, so try removing the filter entirely, if that's an option, and see how it affects yeur results.

Unfortunately I only vaguely know the basic components there and whot they do, not how integrated they are, so I have no idea if that idea is even remotely possible to achieve or test.
 

ygolo

My termites win
Joined
Aug 6, 2007
Messages
5,988
If it's crosstalk, then it's probably the filter from whot little I know on the matter. If yeur signal filter is built into the chip as a physical constraint, then yeu're pretty much screwed. If it's a process, it may be possible to correct the issue. Best to look there to start with if yeu're able to. I'd imagine yeu have several chips to work with as test subjects, so try removing the filter entirely, if that's an option, and see how it affects yeur results.

Unfortunately I only vaguely know the basic components there and whot they do, not how integrated they are, so I have no idea if that idea is even remotely possible to achieve or test.

There isn't a physical filter that can be removed on the chips. But the chips have "equalization" that can be adjusted through software. Unfortunately, the equalization is lane by lane--nothing that takes information from all the lanes to cancel out cross talk. So you can only affect things somewhat indirectly.
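Since each lane's equalization can only be set on its own, one generic way to search the settings is coordinate descent: change one lane at a time, keep the change if the total error count drops, and repeat passes because lanes may affect each other. The sketch below is an assumption-heavy illustration. `measure_errors` stands in for whatever the real test returns, and `toy_measure` is a made-up model, not the chip's behavior.

```python
def tune_lanes(measure_errors, num_lanes, settings, max_passes=5):
    """Coordinate descent over per-lane settings; returns (best_settings, best_score)."""
    current = [settings[0]] * num_lanes
    best = measure_errors(current)
    for _ in range(max_passes):
        improved = False
        for lane in range(num_lanes):
            for candidate in settings:
                if candidate == current[lane]:
                    continue
                trial = list(current)
                trial[lane] = candidate
                score = measure_errors(trial)
                if score < best:  # keep only strict improvements
                    best, current, improved = score, trial, True
        if not improved:
            break  # a full pass changed nothing, so stop early
    return current, best

def toy_measure(eq):
    """Made-up error model: each lane has an ideal setting, plus mild coupling between lanes 0 and 1."""
    ideal = [2, 1, 3]
    return sum((e - t) ** 2 for e, t in zip(eq, ideal)) + 0.5 * abs(eq[0] - eq[1])

print(tune_lanes(toy_measure, num_lanes=3, settings=[0, 1, 2, 3]))
```

Coordinate descent can get stuck when two lanes need to change together, so it is a cheap first pass rather than a guarantee of the best combination.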
 

sLiPpY

New member
Joined
Oct 14, 2009
Messages
2,003
MBTI Type
ISTP
Enneagram
9w8
Instinctual Variant
sp/sx
I'll simply comment on the (not normal) part: There isn't a single billionaire who won't tell you that it wasn't about how they handled the best of possible situations. It was about facing the worst and giving their best.

The success of the project doesn't depend on others. You're asking the questions, it depends upon you.

This is a rare opportunity for your finest hour.

No one else on the whole wide planet can give that to you.

Pick your players and forward a proposal.

March on!
 